ABSTRACT


Internet protocols such as Secure Shell and Internet Protocol Security rely on the as- 
sumption that finding discrete logarithms is hard. The protocols specify fixed groups for Diffie- 
Hellman key exchange that must be supported. Although the protocols allow flexibility in the 
choice of group, it is highly likely that the specific groups required by the standards will be 
used in most cases. There are security implications to using a fixed group, because solving any 
discrete logarithm within a group is comparatively easier after a group-specific precomputation 
has been completed. In this work, we more accurately model real-world cryptographic appli- 
cations with fixed groups. We use an analysis of algorithms to place an upper bound on the 


complexity of solving discrete logarithms given a group-specific precomputation. 

I. Introduction


Thirty years ago, the field of cryptography was revolutionized when Whitfield Diffie 
and Martin Hellman published New Directions in Cryptography [3]. In this seminal paper, they 
introduced the idea of public key cryptography, a concept that now provides the foundation for 
secure communications and secure financial transactions over the Internet. In the same paper, 
they also described a method for exchanging secret keys over an insecure network. Now known 
as the Diffie-Hellman key exchange, this method is used within common network security pro- 
tocols including Secure Shell (SSH) [28] and Internet Protocol Security (IPsec) [12]. 

The Diffie-Hellman key exchange is an application of group theory. Computing the se- 
cret key requires modular exponentiation: raising a number to an exponent within a group of 
integers modulo a prime number. The inverse operation of modular exponentiation is called 
finding the discrete logarithm. Exponentiation is computationally easy, while finding discrete 
logarithms is believed to be hard. The key exchange depends on this asymmetry in computa- 
tional complexity for its security. If an adversary can compute discrete logarithms, the adversary 
can break Diffie-Hellman and recover the secret key. This situation has led to a vast amount
of research toward finding efficient algorithms to solve discrete logarithms and also towards 
understanding the computational complexity of the discrete logarithm problem. 

Algorithms solving discrete logarithms generally can be divided into two phases: a 
precomputation phase and a search phase. The precomputation phase is run first and the result 
is stored in memory. The stored result is used in the search phase to speed up computation of 
the discrete logarithm. Often, the precomputation algorithm requires only the group description. 
This means that the first phase is independent of any particular instance of a discrete logarithm. 
Additional discrete logarithms over the same group can be solved by running just the search 
phase. 
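
To make the two-phase structure concrete, the following is a minimal Python sketch of the baby-step giant-step method, one of the generic algorithms surveyed later in this work, split into a group-specific precomputation and a per-instance search. The function names `precompute` and `search` are our own illustrative choices, the toy group (p = 11, g = 2) is far too small for real use, and the modular inverse via `pow(g, -m, p)` assumes Python 3.8 or later.

```python
from math import isqrt

def precompute(p, g, order):
    """Group-specific phase: depends only on the group (p, g), not on any target."""
    m = isqrt(order - 1) + 1              # m = ceil(sqrt(order))
    table = {}
    value = 1
    for j in range(m):                    # baby steps g^0, g^1, ..., g^(m-1)
        table.setdefault(value, j)
        value = value * g % p
    giant = pow(g, -m, p)                 # g^(-m) mod p
    return m, table, giant

def search(p, a, advice):
    """Instance-specific phase: find x with g^x = a (mod p) using the stored result."""
    m, table, giant = advice
    gamma = a % p
    for i in range(m):                    # giant steps a * g^(-i*m)
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * giant % p
    return None                           # a is not a power of g

advice = precompute(11, 2, 10)            # run once for the group
logs = [search(11, a, advice) for a in (5, 9, 3)]   # search phase only, per instance
```

The stored result is built once and reused for all three targets; only the search phase is repeated for each new instance.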

Our work focuses on the efficiency of solving multiple discrete logarithms over the same 
group. The practical importance of this investigation can be seen when we examine how the 


Diffie-Hellman key exchange is used in real applications, such as the SSH and IPsec security 


protocols. Within these protocols, a small number of standard groups are defined. For example, 
the standard for SSH only defines two groups that must be supported. There are valid reasons 
to use standard groups. In particular, when two users exchange keys, using a standard group 
relieves one user from the computational burden of creating a secure group and the other user 
from the need to trust that it has been done securely. Choosing a secure group requires avoiding 
certain groups with characteristics that make them easier to solve. Leaving group choice to a 
standards committee saves the user significant computation time, but the result will be many 
key exchanges occurring over the same fixed groups. This provides an advantage to the attacker 
in that the cost of precomputation for a group can now be amortized over many key exchanges. 
As more exchanges occur under a group, the group precomputation increases in value to an 
attacker. Therefore, our analysis must take into account an attacker that can dedicate large 
parallel systems to the precomputation. 

Typically, a security analysis of discrete logarithm cryptography would consider the 
complexity of the discrete logarithm problem (DLP). However, the DLP is an incomplete model 
for cryptographic applications with fixed groups. In these applications, the group is constant, 
but the DLP treats the group as a variable input to the problem. In the DLP, the problem is to
find a single discrete logarithm in a given group; however, a precomputation provides no benefit
when solving only one instance. Group-specific precomputation is most valuable when the
group is reused often, which is the case for standards that specify fixed groups. Current security
proofs based on the DLP do not account for group-specific precomputation and, therefore,
overestimate the difficulty of attacking applications that specify fixed groups.

In this work, we present a more conservative security model for fixed groups that shows 
that such real-world applications provide less cryptographic strength than previously acknowl- 
edged. In particular, we introduce the para-discrete logarithm problem (PDLP), a variant of the 
DLP where the group is not an input, but rather dependent only on the input size. This allows us 
to model the result of a group-specific precomputation as an advice string. In complexity theory, 
an advice string is roughly a piece of data provided to help solve a computational problem,
and the data can be dependent on the size of the input, but not on the input itself. In the standard 
DLP, the precomputation is not an advice string, because it is based on an input: the group. 
Once the precomputation has been completed for a standard group, the DLP is reduced to our 
PDLP with an advice string. 

We use an analysis of algorithms to place an upper bound on the complexity of the para- 
discrete logarithm problem with an advice string. In particular, we provide an analysis of the 


common algorithms for solving discrete logarithms, focusing on the relationship between the 


asymptotic running times of the two phases and the asymptotic bit-length of the advice string. 
Given a group of order $N$, we show that the generalized para-discrete logarithm problem can be
solved in $O(N^{1/3})$ group operations with an advice string of size $O(N^{1/3})$. The precomputation
of such an advice string requires $O(N^{2/3})$ group operations.

The rest of this work is organized as follows. In the next chapter, we review both the technical
background and the prior research in the field of cryptography that is relevant to understanding 
our work. In Chapter III, we survey the known algorithms for the discrete logarithm problem 
and perform a traditional analysis of their complexity. In Chapter IV, we consider the complex- 
ity of discrete logarithms over fixed groups and reanalyze the discrete logarithm algorithms in 


that context. 




II. Background 


In this chapter, we review both the technical background and the prior research in the 
field of cryptography that is relevant to understanding our work. In particular, we begin by 
describing discrete logarithms. Next, we examine their importance in cryptology. Lastly, we 


look at the use of fixed groups in cryptographic protocols. 


A. Discrete Logarithms Explained 


In this section, we describe discrete logarithms. In particular, we first relate discrete 
logarithms to standard logarithms in real numbers. Then, we provide mathematical definitions 
for group exponentiation and discrete logarithms. Next, we provide a simple concrete example 
of discrete logarithms. Lastly, we define the standard computational problems regarding
discrete logarithms; that is, the discrete logarithm problem (DLP) and the generalized discrete 
logarithm problem (GDLP). Throughout, we assume the reader is familiar with the concept of 
groups from abstract algebra. 

Discrete logarithms are so named because they are analogous to standard logarithms 
with real numbers. Just as the logarithm is the inverse operation of exponentiation, the discrete 
logarithm is the inverse operation of group exponentiation. In the real numbers, $\log_g a = x$ if
$g^x = a$. The same is true for discrete logarithms, except $g$ and $a$ are elements of a multiplicative
cyclic group, $G$, with generator $g$. A cyclic group is a group where all the elements of the group
can be generated by raising one element, a generator, to successive powers.


Group exponentiation to a power $x \in \mathbb{N}$ can be defined as repeated group multiplication,

$$g^x = \prod_{i=1}^{x} g$$


Methods such as repeated-squaring [16, Algorithm 2.143] allow group exponentiation to be
done efficiently, with on the order of $\lg x$ multiplications. In the group $\mathbb{Z}_n^*$, where the group
operation is multiplication modulo an integer, $n$, group exponentiation is called modular
exponentiation. In this setting, the value of $g^x$ is $a$ if and only if $g^x \equiv a \bmod n$. We can
compute $g^x$ by raising $g$ to the power $x$ in the integers, then finding the remainder modulo $n$.
(There are more practical algorithms as well [8].)
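
As an illustration of repeated squaring, the following Python sketch computes $g^x \bmod n$ with one squaring per bit of the exponent and one extra multiplication per set bit. The function name `mod_exp` is our own; it is a sketch of the general square-and-multiply technique, not a transcription of the cited algorithm, and Python's built-in three-argument `pow` is used only as a cross-check.

```python
def mod_exp(g, x, n):
    """Compute g^x mod n by repeated squaring (square-and-multiply)."""
    result = 1 % n
    base = g % n
    while x > 0:
        if x & 1:                         # low bit of the exponent is set
            result = result * base % n
        base = base * base % n            # square for the next bit
        x >>= 1
    return result

# Cross-check against the built-in modular exponentiation.
check = mod_exp(7, 123456789, 1000003) == pow(7, 123456789, 1000003)
```

Reducing modulo $n$ after every multiplication keeps the intermediate values small, which is what makes this practical compared with raising $g$ to the power $x$ in the integers first.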

Finding a discrete logarithm means inverting the exponentiation and finding the exponent
$x$ given the value, $a$. That is, given $g$, $n$, and $a$, find a value of $x$, $0 \le x < n - 1$,
such that $g^x \equiv a \bmod n$. While efficient algorithms exist for group exponentiation, no efficient
algorithm is known for computing discrete logarithms. This asymmetry is what makes


exponentiation useful in public key cryptography. 


1. Discrete Logarithm Example 


To further clarify, we will use a concrete example in $\mathbb{Z}_p^*$. The group $\mathbb{Z}_p^*$ is the multiplicative
group of integers modulo a prime, $p$. The elements of $\mathbb{Z}_p^*$ are the integers $1, 2, \ldots, p - 1$.
In this example, $a$ is an element of the group that can be represented as $a = g^x$, where $x$
is an integer, $0 \le x < p - 1$. The discrete logarithm of $a$ to the base $g$ can be written as
$\log_g a = \log_g g^x = x$. For this example, if we let $p = 11$ and $g = 2$, Table 1 shows the powers
of $g$. If we look in the table at the row $x = 4$ we see $a = g^x = 2^4 = 16 \equiv 5 \bmod 11$.
All ten elements of $\mathbb{Z}_{11}^*$ are generated before we see another 1 in the table. For every $g \in \mathbb{Z}_p^*$,
$g^{p-1} \equiv 1 \bmod p$, and, if $g$ is a generator, then there is no exponent $0 < x < p - 1$ such that
$g^x \equiv 1 \bmod p$. There is no $0 < x < 10$ such that $2^x \equiv 1 \bmod 11$, so 2 is a generator of $\mathbb{Z}_{11}^*$. Also
note that the values for greater exponents repeat, $2^0 \equiv 2^{10} \equiv 1 \bmod 11$.

When we invert this table we have the discrete logarithms in $\mathbb{Z}_{11}^*$. Table 2 shows us the
discrete logarithms. For example, looking in the table at $a = 5$ we find $\log_2 5 = 4$.
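
The two tables can be reproduced with a few lines of Python. This is only a sketch for the toy group used in the example ($p = 11$, $g = 2$); the variable names are our own, and exhaustive tabulation like this is feasible only because the group is tiny.

```python
p, g = 11, 2

# Table 1: exponent x -> 2^x mod 11, for x = 0, ..., p - 2
powers = {x: pow(g, x, p) for x in range(p - 1)}

# Table 2: inverting Table 1 gives element a -> log_2 a
logs = {a: x for x, a in powers.items()}

for x, a in powers.items():
    print(f"2^{x} = {a} (mod {p})")

# g is a generator exactly when its powers cover every element 1, ..., p - 1
is_generator = sorted(logs) == list(range(1, p))
```

Looking up `logs[5]` returns 4, matching the row $x = 4$ of the table of powers.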


2. Discrete Logarithm Problem 


Before we can analyze the security of cryptography, it is helpful to formally define the 
computational problems upon which that security relies. The DLP is the problem of solving 
discrete logarithms over the group of integers modulo a prime and can be formalized as fol- 
lows [16], 


Definition 1 The Discrete Logarithm Problem (DLP)
Input: Prime: $p$, Generator of $\mathbb{Z}_p^*$: $g$, Element of $\mathbb{Z}_p^*$: $a$
Output: Exponent: $x$ satisfying $g^x \equiv a \bmod p$, where $0 \le x < p - 1$.














 x  | 2^x   | a = 2^x mod 11
 0  | 2^0   | 1
 1  | 2^1   | 2
 2  | 2^2   | 4
 3  | 2^3   | 8
 4  | 2^4   | 5
 5  | 2^5   | 10
 6  | 2^6   | 9
 7  | 2^7   | 7
 8  | 2^8   | 3
 9  | 2^9   | 6
 10 | 2^10  | 1

Table 1: Powers of $g = 2$ in $\mathbb{Z}_{11}^*$


Discrete logarithms can be defined over any cyclic group; they need not be restricted
to $\mathbb{Z}_p^*$. Therefore, the discrete logarithm problem can be generalized to apply to any cyclic
group [16].


Definition 2 The Generalized Discrete Logarithm Problem (GDLP)
Input: Cyclic Group: $G$, Generator of $G$: $g$, Element of $G$: $a$
Output: Exponent: $x$ satisfying $g^x = a$, where $0 \le x < |G|$.





B. Cryptography and Discrete Logarithms 


The difficulty of solving discrete logarithms relative to exponentiation makes them very 
useful in cryptographic applications. The security of many common cryptographic applications 
depends on the assumption that solving discrete logarithms is infeasible. The first published 


cryptographic use of discrete logarithms was in the Diffie-Hellman key agreement protocol [3]. 

















 a  | log_2 a   | x
 1  | log_2 1   | 0
 2  | log_2 2   | 1
 3  | log_2 3   | 8
 4  | log_2 4   | 2
 5  | log_2 5   | 4
 6  | log_2 6   | 9
 7  | log_2 7   | 7
 8  | log_2 8   | 3
 9  | log_2 9   | 6
 10 | log_2 10  | 5

Table 2: Discrete Logarithms to the Base $g = 2$ in $\mathbb{Z}_{11}^*$


The first public key cryptosystem relying on discrete logarithms was the ElGamal cryptosys- 
tem [4]. ElGamal also developed the first signature scheme based on discrete logarithms, a 
variant of which is the Digital Signature Algorithm (DSA) [19]. 

In this section, we present several cryptographic algorithms to demonstrate the practical 
importance of discrete logarithms. In particular, we first examine the Diffie-Hellman key agree- 
ment scheme. Next, we focus on the public key encryption system known as ElGamal. Finally, 
we examine the Digital Signature Algorithm (DSA). 


1. Diffie-Hellman Key Agreement 


The Diffie-Hellman key agreement enables two parties to agree on a secret key over an 
insecure channel without revealing the key to an attacker. In this scheme, two participants, A 
and B, agree on a cyclic group, G, and generator of the group, g. We must assume the attacker 
will know the details of the group, as they will be sent over the same insecure channel. A and B 
independently and randomly choose their own secret exponents, a and b, respectively. User A 


computes and transmits $g^a$; B computes and transmits $g^b$. The secret key they agree on is $g^{ab}$.
User A computes the secret key by raising $g^b$ (received from B) to the power, $a$ (A’s secret),

$$(g^b)^a = g^{ba} = g^{ab}$$

Equivalently, user B raises the value $g^a$ to the power, $b$,

$$(g^a)^b = g^{ab}$$

Now both users know the secret key, $g^{ab}$, while the attacker has only seen $g^a$ and $g^b$.
Clearly, however, if the attacker could compute discrete logarithms in the group, $G$, then the
attacker could solve for either $a$ or $b$ and compute the secret key. It is an open question whether
there is an easier way to find $g^{ab}$ than to compute discrete logarithms. The task of computing
$g^{ab}$ from $g^a$ and $g^b$ is called the Diffie-Hellman problem.
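
The exchange, and the consequence of an attacker being able to compute discrete logarithms, can be sketched in Python as follows. The group is the toy group $\mathbb{Z}_{11}^*$ with $g = 2$ from the earlier example, so the attacker's logarithm is found by exhaustive search; the helper `brute_force_log` and all variable names are our own illustrative assumptions.

```python
import secrets

p, g = 11, 2                              # toy group; far too small for real use

def brute_force_log(g, target, p):
    """Attacker's step: try every exponent until g^x matches the target."""
    value = 1
    for x in range(p - 1):
        if value == target:
            return x
        value = value * g % p
    raise ValueError("target is not a power of g")

a = secrets.randbelow(p - 1)              # A's secret exponent
b = secrets.randbelow(p - 1)              # B's secret exponent
g_a = pow(g, a, p)                        # sent by A over the insecure channel
g_b = pow(g, b, p)                        # sent by B over the insecure channel

key_A = pow(g_b, a, p)                    # A computes (g^b)^a
key_B = pow(g_a, b, p)                    # B computes (g^a)^b

# The attacker sees only p, g, g^a and g^b, solves one discrete logarithm,
# and then computes the shared key exactly as user A would.
recovered_a = brute_force_log(g, g_a, p)
key_attacker = pow(g_b, recovered_a, p)
```

In a group of cryptographic size the exhaustive search is infeasible, which is the asymmetry the key agreement relies on.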

It should also be noted that the Diffie-Hellman key agreement does not provide authen- 
tication. An attacker with the ability to modify and insert messages could be in the middle of 
an exchange between users A and B. If this occurs, A and B could unknowingly be sharing 
keys with the attacker and not each other. To avoid this attack, Diffie-Hellman must be part of 


a larger protocol that provides authentication. 


2. ElGamal 


The ElGamal cryptosystem is a method of public key encryption (PKE) that is based on 
Diffie-Hellman [4]. In this subsection, we show that the security of ElGamal is dependent on 
the difficulty of finding discrete logarithms. In particular, we begin with a formal definition of 
PKE. Then we explain what it means for a PKE to be secure. Next we describe the ElGamal 
algorithms. Finally, we demonstrate how the security of ElGamal would be compromised if an
efficient discrete logarithm method is discovered. 


Definition 3 Public Key Encryption (PKE) 
A public key encryption system [6] is a triple of PPT algorithms $(G, E, D)$ such that,

1. $G$ is a key generation algorithm that on input $1^k$ computes output $(e, d)$.
2. $E$ is an encryption algorithm that on input $(1^k, e, m)$ computes output $c$.
3. $D$ is a decryption algorithm that on input $(1^k, d, c)$ computes output $m$.

where $1^k$ is the security parameter, $e$ is the public encryption key, $d$ is the secret decryption key,
$m \in \{0,1\}^*$ is the plaintext message and $c \in \{0,1\}^*$ is the encrypted ciphertext such that if
$G \to (e, d)$ then $D(E(e, m), d) = m$.





The security of a PKE system has been defined in terms of semantic security [6]. Infor- 
mally, a PKE system is semantically secure if an adversary with access to the encryption key, 
e, and ciphertext, c, has no more than a negligible advantage in guessing the plaintext over an 


adversary without access to $e$ or $c$.


Algorithm 4 ElGamal Key Generation
Input: Security parameter: $1^k$
Output: Public encryption key: $e$, Secret decryption key: $d$
$p \leftarrow$ $k$-bit prime such that $p - 1$ has a large prime factor
$g \leftarrow$ generator of $\mathbb{Z}_p^*$
$a \leftarrow$ randomly selected exponent between $0$ and $p - 1$
$d \leftarrow (p, g, a)$
$e \leftarrow (p, g, g^a)$
return $(e, d)$





In ElGamal key generation, a user A selects a prime, $p$, that defines the multiplicative
group, $\mathbb{Z}_p^*$, a generator of that group, $g$, and a secret exponent $a$. User A also computes $g^a$. (In
this, all arithmetic is mod $p$.) A’s private key, $d$, is $(p, g, a)$, and A’s public key, $e$, is $(p, g, g^a)$.
As ElGamal initially presented the scheme, the prime and generator would be fixed for all users, and
the public key would be only $(g^a)$. He acknowledged that having each user select a prime “is
preferable from the security point of view although that will triple the size of the public file.” [4]

To encrypt a message for A, user B must represent his message as $m$, an element of $\mathbb{Z}_p^*$.
User B must choose a random exponent $b$ and compute $c_1 = g^b$ and $c_2 = (g^a)^b m = g^{ab} m$. B
sends the encrypted message, $(c_1, c_2)$, to A. To decrypt, A uses the private key, $a$, to compute




Algorithm 5 ElGamal Encryption
Input: Public encryption key: $e = (p, g, g^a)$, Plaintext message: $m$ where $0 < m < p - 1$
Output: Encrypted ciphertext: $c$
$b \leftarrow$ randomly selected exponent between $0$ and $p - 1$
$c_1 \leftarrow g^b \bmod p$
$c_2 \leftarrow (g^a)^b m = g^{ab} m \bmod p$
$c \leftarrow (c_1, c_2)$
return $c$








Algorithm 6 ElGamal Decryption
Input: Private decryption key: $d = (p, g, a)$, Encrypted ciphertext: $c = (c_1, c_2) = (g^b, g^{ab} m)$
Output: Decrypted plaintext: $m$
$g^{ab} \leftarrow (c_1)^a = (g^b)^a \bmod p$
$(g^{ab})^{-1} \leftarrow$ inverse of $g^{ab}$ using extended Euclidean algorithm
$m \leftarrow (g^{ab})^{-1} c_2 = (g^{ab})^{-1} g^{ab} m \bmod p$
return $m$





$c_2 / c_1^a$. Because inverses can be efficiently computed mod $p$, A can quickly find $m$.

$$\frac{c_2}{c_1^a} = \frac{g^{ab} m}{(g^b)^a} = \frac{g^{ab} m}{g^{ab}} = m$$





As with Diffie-Hellman, ElGamal would be insecure if discrete logarithms could be
solved efficiently. An adversary with the ability to find discrete logarithms in $\mathbb{Z}_p^*$ could recover
the private key, $a$, from the public key, $g^a$. The adversary could then decrypt messages just as
the valid user can.
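
The three ElGamal algorithms can be sketched in Python over a fixed toy group. This is a minimal illustration under stated assumptions: the group ($p = 11$, $g = 2$) is supplied rather than generated, so the prime-selection step of key generation is omitted; the function names are our own; and the modular inverse uses `pow(x, -1, p)` from Python 3.8 or later in place of an explicit extended Euclidean algorithm.

```python
import secrets

def keygen(p, g):
    """Return (public key e, private key d) for the supplied group."""
    a = secrets.randbelow(p - 1)          # secret exponent
    return (p, g, pow(g, a, p)), (p, g, a)

def encrypt(e, m):
    p, g, g_a = e
    b = secrets.randbelow(p - 1)          # fresh random exponent per message
    c1 = pow(g, b, p)                     # g^b
    c2 = pow(g_a, b, p) * m % p           # (g^a)^b * m = g^(ab) * m
    return c1, c2

def decrypt(d, c):
    p, g, a = d
    c1, c2 = c
    g_ab = pow(c1, a, p)                  # (g^b)^a = g^(ab)
    return pow(g_ab, -1, p) * c2 % p      # (g^(ab))^(-1) * g^(ab) * m = m

e, d = keygen(11, 2)
round_trip = [decrypt(d, encrypt(e, m)) for m in range(1, 11)]
```

Every element of $\mathbb{Z}_{11}^*$ decrypts to itself, whichever random exponents are drawn.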


3. Digital Signature Algorithm 


ElGamal also proposed a method for digital signatures in his 1984 paper, A public key 
cryptosystem and a signature scheme based on discrete logarithms [4]. A variation of that 
method, the Digital Signature Algorithm (DSA), was adopted in 1994 as the Digital Signature 
Standard (DSS) [19] and is in common use. In this subsection, we show that the security 
of DSA is dependent on the difficulty of finding discrete logarithms. In particular, we begin 
with a formal definition of a digital signature system. Then we define security for a signature 
scheme. Next, we describe DSA. Finally, we demonstrate how the security of DSA would be 


compromised if an efficient discrete logarithm method is discovered. 

Definition 7 Digital Signature System
A digital signature system is a triple of PPT algorithms $(G, S, V)$ such that,

1. $G$ is a key generation algorithm that on input $1^k$ computes output $(e, d)$.
2. $S$ is a signature generation algorithm that on input $(1^k, d, m)$ computes output $s$.
3. $V$ is a verification algorithm that on input $(e, s, m)$ computes output $v$.

where $1^k$ is the security parameter, $e$ is the public verification key, $d$ is the secret signing
key, $m \in \{0,1\}^*$ is the message to be signed, $s \in \{0,1\}^*$ is the signature string and
$v \in \{\text{true}, \text{false}\}$ is the boolean value indicating the validity of the signature, such that if
$G \to (e, d)$ then $V(e, S(d, m), m) = \text{true}$.





A strong definition of security for a digital signature system is a system that is secure 
against existential forgery under chosen message attack [7]. In a chosen message attack, the 
adversary can choose messages to be signed by the signer. A signature can be existentially 
forged if, in polynomial time, an adversary can create a message and signature that verifies with 


greater than negligible probability even though the message may not be the adversary’s choice. 


Algorithm 8 DSA Key Generation
Input: Security parameter: $1^k$
Output: Public verification key: $e$, Secret signing key: $d$
$L, N \leftarrow$ bit-lengths of $p$ and $q$, respectively, to provide security equivalent to $k$
$p \leftarrow$ $L$-bit prime modulus
$q \leftarrow$ $N$-bit prime such that $q \mid (p - 1)$
$g \leftarrow$ generator of subgroup of $\mathbb{Z}_p^*$ of order $q$ such that $1 < g < p$
$x \leftarrow$ randomly selected exponent between $0$ and $q$
$y \leftarrow g^x \bmod p$
$d \leftarrow (p, q, g, x)$
$e \leftarrow (p, q, g, y)$
return $(e, d)$








In DSA, a private key is $(p, q, g, x)$ and a public key is $(p, q, g, y)$, where $p$, $q$ are prime
with $q \mid (p - 1)$, $g \in \mathbb{Z}_p^*$ is an element of order $q$, $x$ is a secret exponent, and $y = g^x \bmod p$. This
looks similar to keys in ElGamal with the addition of the prime, $q$. The element $g$ is chosen
so that it generates the cyclic subgroup of $\mathbb{Z}_p^*$ of order $q$. Note that while the group, $\mathbb{Z}_p^*$, is not
fixed for all of DSA, it is also not different for every user. Instead, the values $(p, q, g)$ are called


domain parameters and are generated and fixed for a particular domain of users. 

Algorithm 9 DSA Signature Generation
Input: Message: $m$, Secret signing key: $d = (p, q, g, x)$, Approved hash function: Hash()
Output: Signature of $m$: $s = (s', r')$
$k \leftarrow$ randomly selected exponent between $0$ and $q$
$k^{-1} \leftarrow$ inverse of $k$ mod $q$ using extended Euclidean algorithm
$r' \leftarrow (g^k \bmod p) \bmod q$
$s' \leftarrow (k^{-1}(\mathrm{Hash}(m) + x r')) \bmod q$
$s \leftarrow (s', r')$
return $s$








Algorithm 10 DSA Signature Verification
Input: Message: $m$, Signature: $s = (s', r')$, Public verification key: $e = (p, q, g, y)$, Approved
hash function: Hash()
Output: Validity: $v$, such that $v = \text{true} \iff s$ is a valid signature of $m$
$w \leftarrow (s')^{-1} \bmod q$ // using extended Euclidean algorithm
$z \leftarrow \mathrm{Hash}(m)$
$u_1 \leftarrow z w \bmod q$
$u_2 \leftarrow r' w \bmod q$
$v' \leftarrow (g^{u_1} y^{u_2} \bmod p) \bmod q$
if $v' = r'$ then
    $v \leftarrow$ true
else
    $v \leftarrow$ false
end if
return $v$








DSA security depends on the difficulty of solving discrete logarithms. An efficient
algorithm for finding discrete logarithms would result in a complete break of DSA. Recovering
the secret signing key, $x$, from the public key, $y = g^x \bmod p$, can be achieved by solving the
discrete logarithm in $\mathbb{Z}_p^*$ or in the subgroup of $\mathbb{Z}_p^*$ of order $q$.
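
The following Python sketch illustrates DSA signing and verification, and the complete break that follows from solving one discrete logarithm. The domain parameters $(p, q, g) = (23, 11, 4)$ are toy values chosen by us so that $q \mid (p - 1)$ and $g$ has order $q$; SHA-256 reduced mod $q$ stands in for the approved hash function; and the function names, along with the retry when $r'$ or $s'$ is zero, are our own assumptions rather than a transcription of the standard.

```python
import hashlib
import secrets

P, Q, G = 23, 11, 4                       # toy domain parameters; G has order Q in Z_23^*

def h(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % Q

def keygen():
    x = 1 + secrets.randbelow(Q - 1)      # secret exponent in 1..Q-1
    return pow(G, x, P), x                # (public y, secret x)

def sign(x, message):
    while True:
        k = 1 + secrets.randbelow(Q - 1)  # per-signature secret
        r = pow(G, k, P) % Q
        s = pow(k, -1, Q) * (h(message) + x * r) % Q
        if r != 0 and s != 0:
            return s, r                   # signature (s', r')

def verify(y, message, signature):
    s, r = signature
    if not (0 < r < Q and 0 < s < Q):
        return False
    w = pow(s, -1, Q)
    u1 = h(message) * w % Q
    u2 = r * w % Q
    return pow(G, u1, P) * pow(y, u2, P) % P % Q == r

y, x = keygen()
valid = verify(y, b"message", sign(x, b"message"))

# An adversary who can solve y = g^x in the order-Q subgroup recovers x and can sign anything.
recovered_x = next(i for i in range(Q) if pow(G, i, P) == y)
forged = verify(y, b"forged", sign(recovered_x, b"forged"))
```

With parameters this small a signature on one message may also verify for another by coincidence, so the sketch demonstrates only the mechanics, not any security property.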


C. Fixed Groups in Cryptographic Protocols 


The algorithms presented so far select a new group for every exchange or key pair, but in practice
the group is often chosen from a small list of predefined groups. For example, consider a Diffie- 
Hellman key exchange. The two participants must first agree upon a group and a generator. In 
theory, one of the participants could always start by randomly selecting a group at the time of 


the exchange. However, in reality, using common security protocols, the participants will likely 
agree to use a group that is specified in their protocol standard.

In this section, we look at the use of fixed groups in cryptography. In particular, we first 
provide examples of two commonly used cryptographic protocols that define specific groups. 
These are Secure Shell [29] and Internet Protocol Security [10]. Then, we examine the motiva- 
tions for specifying fixed groups in protocol standards. Following this, we discuss the security 
risks of reusing groups. 


1. Groups in SSH 


Protocol standards often specify just a few fixed groups for Diffie-Hellman key exchanges.
The standard for Secure Shell (SSH) [29] only defines two groups. The two predefined
groups are subgroups of $\mathbb{Z}_p^*$, where $p$ is a specific 1024-bit prime and a 2048-bit prime,
respectively. The primes were selected by a method described in the OAKLEY key determi- 
nation protocol [20]. In addition to the two required groups, an SSH implementation is free to 
add additional groups. But since both client and server implementations must have a specific 
group predefined, this is essentially a mechanism to add additional standard groups. The SSH 
standard does not require support for on-the-fly group generation. 

There is, however, a proposed Internet standard, Diffie-Hellman Group Exchange for the 
Secure Shell (SSH) Transport Layer Protocol [5], that extends SSH to allow new private groups. 
The standard defines a method for an SSH server to propose a new group to the client. For this 
Diffie-Hellman group exchange extension to be effective, it must be supported by implementa- 
tions and new private groups must actually be created. The popular OpenSSH implements the 
group exchange, but does not automatically generate new groups. Instead, a utility is included 
that allows a server administrator to generate new groups from the command line. Without new 
group generation being automatic and transparent to the user, it is likely that standard groups 


will still be used even between implementations supporting this extension. 


2. Groups in IKE 


The Internet Key Exchange (IKE) [10] is the key exchange protocol used in Internet 
Protocol Security (IPsec). IKE uses the Diffie-Hellman key exchange and specifies just four 
fixed groups. The first two are subgroups of $\mathbb{Z}_p^*$, where $p$ is a 768-bit prime and 1024-bit prime
respectively. The two primes are chosen by the same Oakley method as in SSH. The other 
two standard groups in IKE are a 155-bit and a 185-bit elliptic curve group. The disparity in 
bit-lengths is because more efficient algorithms are known for solving discrete logarithms in
$\mathbb{Z}_p^*$ groups than in well chosen elliptic curve groups. Therefore a smaller elliptic curve group is
believed to provide security equivalent to a larger $\mathbb{Z}_p^*$ group.


3. Advantages of Fixed Groups 


There are many valid reasons to specify fixed groups for Diffie-Hellman key exchanges 
in a protocol standard. In this subsection, we discuss two major advantages of specifying fixed 
groups. The first advantage we consider is the reduction in protocol complexity. The second 
advantage we examine is that the standard groups can be carefully selected to be secure, saving 


the user the computational expense of creating new secure groups. 


Reduced Protocol Complexity 


If a protocol is shorter and less complex, its security is easier to analyze and there are 
fewer opportunities for flaws. A simpler protocol also makes implementation easier with less 
chance of errors or incompatibilities with other implementations. Using fixed groups reduces 
the protocol complexity. In particular, it eliminates the need to communicate a description of 
the group before the key exchange. Additionally, it eliminates the need for clients to implement 
a method of secure group selection, which, as we see in the next section, can be a complicated


process. 


Securely Selected Groups 


If a group is predefined, it can be carefully selected for desired security properties, and 
the selection is not bound by the computational limitations that would exist if the group selection 
was done during a live protocol transaction. A protocol standard must ensure that the key 
exchange provides an appropriate level of security, and the security provided by a group depends 
on more than just bit-length. Certain groups are weak and must be avoided. Specifically, if the 
group order is the product of only small prime factors, discrete logarithms can be computed 
efficiently in this group [21]. (See Section F on page 24.) 

In both SSH and IKE, the groups were selected with the Oakley method to achieve goals 
of efficiency, security, and trust that there is no back-door. In particular, for an n-bit prime, p, 
the Oakley method fixes the first and last 64 bits to all ones to speed up modular exponentiation.
Then the interior bits of $p$ are set to $(c + m)$, where $c$ is the first $(n - 128)$ bits of $\pi$ and $m$ is the
smallest positive integer such that $p$ and $(p - 1)/2$ are both prime. The reason for using $\pi$ as
the source of randomness is to avoid “any suspicion that the primes have secretly been selected 
to be weak” [20]. 

Additionally, using standard groups eliminates the need for the computationally inten- 
sive process of group creation within the protocol. Creating a new group requires finding a 
prime, $p$, and a generator, $g$, of $\mathbb{Z}_p^*$. No efficient method is known for finding a generator of $\mathbb{Z}_p^*$
for a random prime, $p$. This is because efficiently determining that a number is a generator requires
knowing the factorization of $p - 1$, the order of the group, and factorization is believed to
be a hard problem. Therefore, instead of first choosing a random $p$, we must generate $N = p - 1$
with a known factorization and then test that $p$ is a prime.
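
The role of the factorization can be seen in the standard generator test: $g$ generates $\mathbb{Z}_p^*$ exactly when $g^{(p-1)/q} \not\equiv 1 \bmod p$ for every prime $q$ dividing $p - 1$. The Python sketch below applies this test to the toy prime $p = 11$, where $p - 1 = 2 \cdot 5$; the function name `is_generator` is our own illustrative choice.

```python
def is_generator(g, p, prime_factors):
    """Test whether g generates Z_p^*, given the distinct prime factors of p - 1."""
    n = p - 1
    # g fails to generate the group exactly when its order divides (p - 1)/q for some q.
    return all(pow(g, n // q, p) != 1 for q in prime_factors)

# p = 11, so p - 1 = 10 = 2 * 5
generators = [g for g in range(1, 11) if is_generator(g, 11, [2, 5])]
```

The test needs one modular exponentiation per prime factor, but it cannot be run at all without knowing those factors.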

To avoid creating a weak group, we want $N$ to have a large prime factor. (Again, see
Section F on page 24.) Because $N$ is even, our best case is if $N = 2q$ for $q$ prime. Thus,
to create a secure group $\mathbb{Z}_p^*$, we must select random primes, $q_i$, until $p = 2q_i + 1$ is prime.
Many iterations of primality testing make this a computationally intensive process. If this had
to be done at the start of each transaction, the user may find the long delay unacceptable. If the
group creation was performed automatically on the server it could potentially enable a denial of
service attack.
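
The search for a prime $q$ with $p = 2q + 1$ also prime can be sketched in Python as follows. The Miller-Rabin test shown is one common probabilistic primality test and is our choice for illustration, as are the function names, the number of rounds, and the 64-bit size, which is far below cryptographic sizes so that the sketch runs quickly.

```python
import secrets

def is_probable_prime(n, rounds=32):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:                     # write n - 1 = d * 2^r with d odd
        d //= 2
        r += 1
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False                  # a witnesses that n is composite
    return True

def random_safe_prime(bits):
    """Draw random odd candidates q until both q and p = 2q + 1 are prime."""
    candidates = 0
    while True:
        candidates += 1
        q = secrets.randbits(bits - 1) | (1 << (bits - 2)) | 1   # odd, top bit set
        if is_probable_prime(q) and is_probable_prime(2 * q + 1):
            return 2 * q + 1, q, candidates

p, q, candidates = random_safe_prime(64)
```

The returned count of candidates gives a feel for the cost: many values of $q$ are typically rejected before a suitable pair is found, and the number grows with the bit-length.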


4. Risks of Using Fixed Groups 


The downside of using a fixed group is that it places a high premium on attacking a 
single group. There will be many key exchanges over the same group over many years. To an 
adversary, the value of solving all discrete logarithms over this fixed group will be much higher 
than the value of solving all discrete logarithms over a random group that may be used only 
once. For example, while the value of decrypting a single bank transaction may be small, the 
value of attacking many simultaneously would be great. 

The computational cost of computing multiple logarithms in a single group is much less 
than computing the same number of logarithms in separate groups. This is because algorithms 
to find discrete logarithms often require a precomputation dependent only on the group. Once 
the precomputation is complete for a group, finding additional discrete logarithms in that group 
is comparatively easy. 

Using fixed groups also allows the time-consuming precomputation to occur before a 
specific key-exchange occurs. Consider a hypothetical attack where the precomputation takes 


one year, but then solving each instance takes just one hour. (The attacker can trade off
instance-time for precomputation-time, so such a disparity is not unreasonable.) Given a random group,
the adversary would always take one year from key exchange to solving the key. However, once 
an adversary has completed the precomputation for a standard group, a key in that group could 
be solved just one hour after the exchange occurs. If the encrypted information is only valuable 
to the attacker for a short period of time, only the second attack is worthwhile. 

III. Survey of Discrete Logarithm Algorithms


In this chapter, we survey the known algorithms for solving discrete logarithms and 
perform a traditional analysis of their complexity. In particular, we begin by distinguishing 
between generic algorithms, which work in all cyclic groups, and group-specific algorithms, 
which apply only in certain families of groups. Then we define the model of computation on 
which we will base our analysis. After that, we survey several generic algorithms. Lastly, we 
consider the index calculus algorithm, which is group-specific. For each algorithm we find the
asymptotic running time and space requirements.


A. Generic vs. Group-Specific Algorithms 


There are several known algorithms for solving discrete logarithms. In this section, we
divide them into two categories. The first category we call generic algorithms, because they
apply in any cyclic group. A generic algorithm solves the generalized discrete logarithm
problem (GDLP). The second category is the group-specific algorithms. These are specialized
algorithms that make use of the structure of the group elements and apply only within certain
families of groups.

The generic algorithms we will consider include Shanks’ algorithm [16], which is also
called the baby-step giant-step algorithm, and Pollard’s Rho and Pollard’s Kangaroo algorithms
[22]. These algorithms apply over any cyclic group, including elliptic curve groups and
subgroups of $\mathbb{Z}_p^*$, where better methods do not apply. The group-specific algorithms
we discuss are index calculus algorithms. They apply in $\mathbb{Z}_p^*$. Therefore, index
calculus algorithms solve the standard discrete logarithm problem (DLP).


B. Model of Computation


In this section, we define our model of computation. In particular, we begin by defining
the abstract machine that will execute the algorithms. Then we define the notation used
in our analysis. Finally, we explain the format we will use for our analysis of each algorithm.

For our analysis of algorithms to be consistent we need to define a model of computation. 
Central to that is defining a standard abstract machine for finding the asymptotic running time 
of each algorithm. Our model uses a multitape Turing machine, which is a good model of 
a standard computer. We provide our runtime complexity in terms of the number of group 
operations. We do this because the complexity of the group operation varies among different 
group families. 

Now we define the standard notation we use in our analysis. For each algorithm, we
have a cyclic group G and a generator g of that group. We let N be the order of g,

$$N = |g|,$$

and let n be the bit-length of N,

$$n = \lceil \log_2 N \rceil.$$

When describing the asymptotic performance of these algorithms, we do so in terms of n, as
is common practice. In terms of storage, we assume that elements of G can be represented in
O(n) bits. This assumption is reasonable because there are at most $2^n$ elements in G.

In the following sections, we perform a traditional complexity analysis of several known
algorithms for solving discrete logarithms. Each analysis will follow a standard format. For
each algorithm we begin with a description of the algorithm itself. Next, we analyze the
algorithm’s runtime complexity. Then we analyze the asymptotic space requirements of the
algorithm. We conclude the chapter with a table summarizing the space and runtime complexity
of each algorithm.


C. Brute-Force Search


We begin our survey of generic algorithms with the simplest method, brute-force or
exhaustive search. That is, we simply try every possible exponent ($g^0, g^1, g^2, \ldots$)
until a match is found.


Algorithm 11 Brute-Force Search
Input: Cyclic Group: G, Generator: g, Group Element: a
Output: Exponent: x such that $g^x = a$
    b ← 1
    x ← 0
    while a ≠ b do
        b ← b × g
        x ← x + 1
    end while
    return x





In the worst case, where $a = g^{N-1}$, every exponent would be tested, requiring a total
of N tests. In terms of the bit-length of the input, n, this requires $2^n$ group operations and
comparisons in the worst case. In the average case, one can expect to find the correct exponent
after searching half the space, or $2^{n-1}$ group operations. In either case, the running time
of the algorithm is exponential, $O(2^n)$, and quickly becomes intractable as n increases.

On the other hand, the space requirements are minimal. At each step we need only
store x and b, and both can be represented in n bits. Therefore, the asymptotic space
requirement of the brute-force algorithm is O(n).
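
As a concrete illustration, the following Python sketch runs the brute-force search in the multiplicative group of integers modulo a small prime. The function name `brute_force_dlog` and the toy parameters p = 101, g = 2 are assumptions chosen for this example and are not taken from the analysis above. The loop does not terminate if a is not a power of g.

```python
def brute_force_dlog(g, a, p):
    """Return x with g**x == a (mod p) by trying x = 0, 1, 2, ... in order."""
    b = 1  # invariant: b == g**x mod p
    x = 0
    while b != a:
        b = (b * g) % p  # one group operation per tested exponent
        x += 1
    return x


# Toy parameters (assumed for illustration): 2 generates Z_101^*, so N = 100.
p, g = 101, 2
print(brute_force_dlog(g, pow(g, 57, p), p))  # prints 57
```

Only x and b are kept between iterations, which matches the O(n) space bound.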


D. Precomputed Table Algorithm 


Just two average-case runs of the brute-force search algorithm require an amount of
work equivalent to computing all N exponents. Consider, instead, first computing all N
exponents and storing them. That is the idea behind our next algorithm, the precomputed table
algorithm. We build a table holding every discrete logarithm for the group. After computing
the table, finding an individual discrete logarithm requires just a single table lookup.

The running time of the algorithm is dominated by the precomputation, which requires
N group operations. The asymptotic running time of the precomputed table algorithm is $O(2^n)$.
The advantage of the algorithm is the instant solution of subsequent discrete logarithms in the
same group; only a single table lookup is required.


Algorithm 12 Precomputed Table Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a
Output: Exponent: x such that $g^x = a$
    // First build the table such that hash[$g^x$] = x for 0 ≤ x < N
    b ← 1
    for x = 0 to N − 1 do
        hash[b] ← x
        b ← b × g
    end for
    // Now perform the table lookup
    x ← hash[a]
    return x


Of course, this algorithm is infeasible for values of n of cryptologic significance, as it
is exponential in both time and space complexity. The lookup table holds N values of size n,
giving an asymptotic size of $O(n 2^n)$.
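
The split between a one-time precomputation and cheap later lookups can be sketched in Python as follows. The helper name `build_log_table` and the toy group (p = 101, g = 2, N = 100) are assumptions for illustration only.

```python
def build_log_table(g, p, N):
    """Precompute table[g**x mod p] = x for every exponent 0 <= x < N."""
    table = {}
    b = 1
    for x in range(N):
        table[b] = x
        b = (b * g) % p  # one group operation per table entry
    return table


# Toy parameters (assumed): 2 generates Z_101^*, which has order N = 100.
p, g, N = 101, 2, 100
table = build_log_table(g, p, N)  # done once per group
for secret in (3, 57, 99):
    a = pow(g, secret, p)
    assert table[a] == secret  # every later logarithm is a single lookup
```

The dictionary holds one entry per group element, which is the exponential storage cost described above.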


E. Shanks’ Algorithm


Solving discrete logarithms using brute-force search requires $O(2^n)$ group operations.
With a precomputed table, we can do it in constant time but require $O(n 2^n)$ bits of storage.
What if we could find an optimal point between these two extremes? Shanks’ algorithm gives
us a way to achieve such a balance.

Shanks’ algorithm is also known as the baby-step giant-step algorithm. The algorithm
has two stages. In the first stage of the algorithm, we step consecutively through the first X
powers of g: $g^0, g^1, g^2, \ldots, g^{X-1}$. These are the “baby-steps”. At each step we store
the exponent, i, in a hash table indexed by $g^i$. After X steps we have a table of discrete
logarithms, but only for the first X elements of the cyclic group.

In the second stage, we want to transform the input $a = g^x$ into a value that is in our
range of precomputed discrete logarithms. Starting from $g^x$, we step X elements at a time
through the cyclic group until we reach the beginning of the cycle where we have precomputed
the logarithms. To take these “giant-steps”, we simply multiply by $g^X$.


Algorithm 13 Shanks’ Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Number of exponents to precompute: X
Output: Exponent: x such that $g^x = a$
    // Build table hash such that hash[$g^i$] = i for 0 ≤ i < X
    b ← 1
    for i = 0 to X − 1 do
        hash[b] ← i
        b ← b × g
    end for
    // Now compute successive exponents until you find one in the hash
    b ← a
    y ← 0
    h ← hash[b]
    while $g^h \ne b$ do
        b ← b × $g^X$
        y ← y + 1
        h ← hash[b]
    end while
    x ← h − yX mod N
    return x





When we find a value in the precomputed range we will have the equation

$$g^h = g^{x + yX}.$$

Now we can solve for x,

$$h = x + yX \bmod N,$$

$$x = h - yX \bmod N.$$

We are certain to hit a logarithm in the precomputed range of X consecutive exponents,
because we are stepping by exactly X exponents at a time.

Now we will consider the runtime of the algorithm. The first stage requires X group
operations. The runtime of the second stage will vary depending on the number of giant steps
needed to reach the precomputed range of exponents. The most steps will be needed when
$X \le x < 2X$, putting x just outside the range of precomputed exponents. In this worst case,
the second stage will take $\lceil N/X \rceil$ group operations (multiplications by $g^X$). (We
can store the precomputed exponents in a hash table to avoid the cost of sorting the table.)

To minimize the total computation time, we must choose X so that the number of
baby-steps equals the number of giant-steps. That is when $X = \lceil N/X \rceil$,

$$X = N/X,$$
$$X^2 = N,$$
$$X = \sqrt{N},$$
$$X = \sqrt{2^n},$$
$$X = 2^{n/2}.$$

Given $X = 2^{n/2}$, both stages of the algorithm take $2^{n/2}$ group operations. Therefore the
runtime complexity of Shanks’ algorithm is $O(2^{n/2})$. Although the running time is still
exponential, it is a significant improvement over the brute-force search.

The space requirements are a middle ground between the brute-force search and the
precomputed table algorithms. The table in Shanks’ algorithm will require X entries of size
n bits. Therefore the space complexity of Shanks’ algorithm is $O(n 2^{n/2})$.
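
The two stages can be sketched in Python as shown below. This is a minimal illustration in a subgroup of the integers modulo a prime, with X fixed at $\lceil\sqrt{N}\rceil$. The function name `shanks_dlog` and the toy parameters are assumptions for this example. It follows the variant described above, in which the giant steps multiply by $g^X$ until the running value falls inside the precomputed range.

```python
from math import isqrt


def shanks_dlog(g, a, p, N):
    """Baby-step giant-step: return x in [0, N) with g**x == a (mod p).

    g is assumed to have order N modulo the prime p.
    """
    X = isqrt(N - 1) + 1  # ceil(sqrt(N)) baby steps

    # Stage 1 (baby steps): table[g**i] = i for 0 <= i < X.
    table = {}
    b = 1
    for i in range(X):
        table.setdefault(b, i)
        b = (b * g) % p

    # Stage 2 (giant steps): multiply by g**X until we land in the table.
    giant = pow(g, X, p)
    b = a % p
    for y in range(X + 1):
        if b in table:
            # b == g**(x + y*X) == g**h, so x = h - y*X (mod N).
            return (table[b] - y * X) % N
        b = (b * giant) % p
    raise ValueError("a is not in the subgroup generated by g")


# Toy parameters (assumed): 4 has prime order 509 modulo 1019.
p, g, N = 1019, 4, 509
x = shanks_dlog(g, pow(g, 321, p), p, N)
assert pow(g, x, p) == pow(g, 321, p)
```

Both the table size and the number of giant steps are about $\sqrt{N}$, which is the balance point derived above.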


F. Pohlig-Hellman Algorithm 


The Pohlig-Hellman algorithm makes use of the prime factorization of N, the order of
the group. For groups of prime order this algorithm provides no advantage and is equivalent to
Shanks’ algorithm. Our analysis will focus on the case where the order, N, has only small prime
factors. This is where the algorithm is most efficient, and this is why some groups are weaker
than others, motivating standards bodies to include specific “secure” groups in their standards.

The first step of the Pohlig-Hellman algorithm is to factor N, the order of the group.
When N has only small prime factors the factorization can be found easily. Let the factorization
be $N = \prod_{i=1}^{k} p_i^{n_i}$.

For each unique prime factor, $p_i$, we solve for $x_i = x \bmod p_i^{n_i}$. Once each $x_i$
is found, they can be combined using the Chinese Remainder Theorem to find x, requiring
$O(k \log N)$ group operations and $O(k \log N)$ space [21].


Algorithm 14 Pohlig-Hellman Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Order of group: N
Output: Exponent: x such that $g^x = a$
    Find the factorization of $N = \prod_{i=1}^{k} p_i^{n_i}$
    // For each factor $p_i^{n_i}$ find $x_i = x \bmod p_i^{n_i}$
    for i = 1 to k do
        z ← a
        h ← $g^{-1}$
        q ← $(p - 1)/p_i$
        $g_i$ ← $g^q$
        for j = 0 to ($n_i$ − 1) do
            w ← $z^q$
            $b_j$ ← $\log_{g_i} w$ // Solve this discrete logarithm using Algorithm 13
            z ← z × $h^{b_j}$
            h ← $h^{p_i}$
            q ← q / $p_i$
        end for
        $x_i$ = $\sum_{j=0}^{n_i - 1} b_j p_i^j$
    end for
    Solve for x given $x_1, \ldots, x_k$ using the Chinese Remainder Theorem
    return x





To find an $x_i$, we find each coefficient, $b_j$, from the representation
$x_i = \sum_{j=0}^{n_i - 1} b_j p_i^j$. From Algorithm 14, $b_j = \log_{g_i} w$, where log is a
discrete logarithm. The base is $g_i = g^{(p-1)/p_i}$, so the order of $g_i$ is $p_i$. For a
group with a large prime factor, $p_i$, the dominant step will be finding the discrete logarithm
in the subgroup of order $p_i$ using Shanks’ algorithm.

For small $p_i$, discrete logarithms can be solved with precomputed tables. In the case
where all $p_i$ are small relative to N, the dominant step of the algorithm is computing
$w = z^q$, requiring $O(\log N)$ group operations [21]. The number of times $z^q$ must be
computed is $\sum_i n_i$, which is $O(\log N)$ when the prime factors are small. This gives a
total running time of $O((\log N)^2)$ or $O(n^2)$.
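
The digit-by-digit recovery and the final recombination can be sketched in Python as follows. This is a minimal illustration, not a transcription of Algorithm 14. It assumes a toy group whose order has only small prime factors, it solves each small subgroup logarithm by brute force rather than with Shanks’ algorithm, and all function names are hypothetical. The modular inverse via `pow(m, -1, q)` requires Python 3.8 or later.

```python
def factorize(n):
    """Trial-division factorization: return {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def dlog_prime_power(g, a, p, N, q, e):
    """Return x mod q**e, one base-q digit at a time."""
    gamma = pow(g, N // q, p)  # element of order q
    x = 0
    for j in range(e):
        z = (a * pow(g, N - x, p)) % p      # strip the digits found so far
        w = pow(z, N // q ** (j + 1), p)    # project into the order-q subgroup
        digit, b = 0, 1
        while b != w:                       # brute-force log of w to base gamma
            b = (b * gamma) % p
            digit += 1
        x += digit * q ** j
    return x


def pohlig_hellman(g, a, p, N):
    """Return x in [0, N) with g**x == a (mod p); g has order N."""
    x, modulus = 0, 1
    for q, e in factorize(N).items():
        qe = q ** e
        xi = dlog_prime_power(g, a, p, N, q, e)
        # Chinese Remainder Theorem: lift x from mod `modulus` to mod modulus*qe.
        t = ((xi - x) * pow(modulus, -1, qe)) % qe
        x += modulus * t
        modulus *= qe
    return x


# Toy parameters (assumed): 2 generates Z_101^*, and N = 100 = 2^2 * 5^2.
assert pohlig_hellman(2, pow(2, 73, 101), 101, 100) == 73
```

Each inner search runs over at most q elements, so the cost is governed by the largest prime factor of N.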


G. Pollard’s Rho Algorithm 


The next algorithm we present, Pollard’s Rho algorithm, has a running time of the same
order as Shanks’ algorithm, but avoids a large stored table. The rho algorithm takes
advantage of the birthday paradox: there is a greater than 50% probability that two people
out of 23 chosen randomly will share a birthday. More generally, when selecting elements at
random from N elements, a collision will be found after an expected $\sqrt{\pi N/2}$ selections [27].
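
The birthday figures quoted above can be checked numerically. The short Python sketch below computes the exact collision probability for k uniform draws from n values and compares it with the $\sqrt{\pi N/2}$ estimate; the function name is an assumption for illustration.

```python
from math import pi, sqrt


def collision_probability(k, n):
    """Probability that k uniform draws from n values contain a repeat."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (n - i) / n
    return 1.0 - p_distinct


print(collision_probability(23, 365))  # about 0.507
print(sqrt(pi * 365 / 2))              # about 23.9 draws expected by the estimate
```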

To find the discrete logarithm of an element a to the base g, Pollard’s Rho algorithm
steps through a random sequence of group elements $s_i$ that can be represented as products of
powers of a and g,

$$s_i = a^{a_i} g^{g_i} = g^{x a_i} g^{g_i} = g^{x a_i + g_i}.$$

The algorithm searches for a cycle in the sequence: two elements $s_u$, $s_v$, $u \ne v$, such
that $s_u = s_v$. Solving this equation for x gives the discrete logarithm of a,

$$s_u = s_v,$$
$$g^{x a_u + g_u} = g^{x a_v + g_v},$$
$$x a_u + g_u = x a_v + g_v \bmod N,$$
$$x a_u - x a_v = g_v - g_u \bmod N,$$
$$x (a_u - a_v) = g_v - g_u \bmod N,$$
$$x = (a_u - a_v)^{-1} (g_v - g_u) \bmod N.$$


The running time is dominated by a search for a cycle in the sequence. Finding a cycle
could be accomplished by storing each element in the sequence until one is repeated. This would
require a large amount of storage, so instead Pollard uses the Floyd cycle-finding algorithm,
which requires storing just two sequence elements, $s_i$ and $s_{2i}$. The element $s_{2i}$ is
always twice as far into the sequence as $s_i$, and a cycle is found when $s_i = s_{2i}$. To
advance both sequences, one step of the algorithm requires a total of three steps of the
sequences. Pollard’s [22] calculations gave a mean value for i of $1.08\sqrt{N}$. The asymptotic
running time is $O(2^{n/2})$ group operations, with storage of just O(n).
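
The Floyd-based search can be sketched in Python as follows. This is a minimal illustration under several assumptions: the group is a prime-order subgroup of the integers modulo a prime, the three-way partition is taken to be the residue of the element modulo 3, and the search restarts from a fresh random point whenever the collision gives $a_u = a_v$ and so cannot be solved. The function names are hypothetical, and the modular inverse via `pow(d, -1, N)` requires Python 3.8 or later.

```python
import random


def rho_step(s, ea, eg, a, g, p, N):
    """One step of the walk, keeping the invariant s == a**ea * g**eg (mod p)."""
    r = s % 3  # assumed partition of the group into three subsets
    if r == 1:
        return (s * a) % p, (ea + 1) % N, eg
    if r == 0:
        return (s * s) % p, (2 * ea) % N, (2 * eg) % N
    return (s * g) % p, ea, (eg + 1) % N


def pollard_rho_dlog(g, a, p, N, rng=random):
    """Return x with g**x == a (mod p); g has prime order N."""
    while True:
        ea, eg = rng.randrange(N), rng.randrange(N)
        start = ((pow(a, ea, p) * pow(g, eg, p)) % p, ea, eg)
        tortoise = hare = start
        while True:
            # Floyd: the tortoise takes one step, the hare takes two.
            tortoise = rho_step(*tortoise, a, g, p, N)
            hare = rho_step(*rho_step(*hare, a, g, p, N), a, g, p, N)
            if tortoise[0] == hare[0]:
                break
        (_, au, gu), (_, av, gv) = tortoise, hare
        da = (au - av) % N
        if da != 0:
            # x * (a_u - a_v) = g_v - g_u (mod N)
            return ((gv - gu) * pow(da, -1, N)) % N
        # Degenerate collision: retry from a new random starting point.


# Toy parameters (assumed): 4 has prime order 509 modulo 1019.
p, g, N = 1019, 4, 509
assert pollard_rho_dlog(g, pow(g, 123, p), p, N) == 123
```

Only the two current triples are stored, which is the source of the O(n) space bound.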

Algorithm 15 is an improved version of Pollard’s Rho method due to van Oorschot and
Wiener [27]. Their method finds the cycle by stepping just once through the sequences,
providing a speedup by a factor of 3. This is possible because they store distinguished points.
Distinguished points are elements of the group with an easily distinguished property, for
example, elements where the first c bits of their binary representation are zeros. We start at a
random location and step through the sequence until we reach a distinguished point. We store the
distinguished point and start again from a new random location. When we reach a distinguished
point that we already have stored, we have found a cycle and can solve for the logarithm.


Algorithm 15 Pollard’s Rho Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Order of g: N
Output: Exponent: x such that $g^x = a$
    // Search for a cycle in the random sequence $S = s_0, s_1, \ldots$ defined by Algorithm 16
    success ← false
    D ← a subset of distinguished points from G
    while (success = false) do
        i ← 0
        $a_i$ ← randomly selected exponent between 0 and p − 1
        $g_i$ ← randomly selected exponent between 0 and p − 1
        $s_i$ ← $a^{a_i} g^{g_i}$
        repeat
            i ← i + 1
            Calculate $s_i, a_i, g_i$ applying Algorithm 16
        until ($s_i \in D$)
        // If we have already stored this point before
        if (($a_j, g_j$) ← hash($s_i$)) then
            success ← true
        else
            hash($s_i$) ← ($a_i, g_i$)
        end if
    end while
    m = $a_i - a_j$ mod N
    x = $m^{-1} (g_j - g_i)$ mod N
    return x





The running time of this version of the rho algorithm is the sum of the time to find a
collision, $T_c$, plus the time to reach a distinguished point, $T_d$. If we assume the sequence
is a random mapping, then the expected time to a collision will be $T_c = \sqrt{\pi N/2}$. The
time to reach a distinguished point depends on the frequency of distinguished points. Given that
there are $c\sqrt{N}$ distinguished points in the group for some constant $c > 1$, one of every
$N/(c\sqrt{N}) = \sqrt{N}/c$ elements is a distinguished point. The sequence reaches a
distinguished point after an expected $T_d = \sqrt{N}/c$ steps. The total expected running time
of the rho algorithm is

$$T_c + T_d = \sqrt{\frac{\pi N}{2}} + \frac{\sqrt{N}}{c} = \left(\sqrt{\frac{\pi}{2}} + \frac{1}{c}\right)\sqrt{N}.$$

The asymptotic running time in terms of the bit-length n is $O(2^{n/2})$.


Algorithm 16 Pollard’s Rho - Random Sequence Algorithm
Input: Element: $s_i$, Exponents: $a_i, g_i$, such that $s_i = a^{a_i} g^{g_i}$
Output: Element: $s_{i+1}$, Exponents: $a_{i+1}, g_{i+1}$, such that $s_{i+1} = a^{a_{i+1}} g^{g_{i+1}}$
    // Given a partitioning of G into three equal-sized subsets $S_1, S_2, S_3$
    if $s_i \in S_1$ then
        $s_{i+1} = a s_i$
        $a_{i+1} = a_i + 1$ mod N
        $g_{i+1} = g_i$
    else if $s_i \in S_2$ then
        $s_{i+1} = s_i^2$
        $a_{i+1} = 2 a_i$ mod N
        $g_{i+1} = 2 g_i$ mod N
    else if $s_i \in S_3$ then
        $s_{i+1} = g s_i$
        $a_{i+1} = a_i$
        $g_{i+1} = g_i + 1$ mod N
    end if
    return $s_{i+1}, a_{i+1}, g_{i+1}$


The algorithm needs storage for the distinguished points. The expected number of
distinguished points will be the expected number of steps multiplied by the fraction of elements
that are distinguished points,

$$\left(\sqrt{\frac{\pi}{2}} + \frac{1}{c}\right)\sqrt{N} \cdot \frac{c}{\sqrt{N}} = c\sqrt{\frac{\pi}{2}} + 1.$$

For each distinguished point, we store a pair of n-bit exponents. Thus the total expected
storage required by the algorithm is $(c\sqrt{2\pi} + 2)n$ bits. Because c is a constant, the
total storage requirement is O(n).


H. Pollard’s Kangaroo Algorithm 


Another generic algorithm discovered by Pollard [22] is the kangaroo or lambda method.
Its runtime differs from that of Pollard’s Rho method by only a constant factor. It can also be
used to find discrete logarithms when the exponent is known to lie in a smaller interval. We
present an improved version, due to van Oorschot and Wiener [27], that uses distinguished
points.

The kangaroo method gets its name because it can be described with an analogy of two
kangaroos hopping. If we imagine the elements of the cyclic group as steps on a path, ordered
by exponent, ($g^0, g^1, g^2, \ldots$), then each hop of the kangaroo is from one element
of the group to another. The distance of the hop, $s_i$, is selected from a small set of possible
hop distances S. The choice of $s_i$ is based only on the current position, x, using a hash
function $h(x) = i$. This means that any kangaroo that lands on a particular element will always
take the same sequence of hops from then on.


Algorithm 17 Pollard’s Kangaroo Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Order of g: N
Output: Exponent: x such that $g^x = a$
    // Select a small sequence of possible step sizes
    S ← ($s_0, s_1, \ldots, s_{k-1}$) where $s_i = 2^i$ and k such that the mean of the entries is $\sqrt{N}/2$
    R ← ($r_0, r_1, \ldots, r_{k-1}$) where $r_i = g^{s_i}$
    // Select a hash function to map a group element to a particular step size, $s_i$
    h(x) ← hash function mapping G into the interval [0..k − 1]
    D ← a subset of distinguished points from G
    // The “tame” kangaroo starts off half way through the cycle
    $x_t$ ← $g^{N/2}$
    $d_t$ ← 0
    // The “wild” kangaroo starts off from $a = g^x$
    $x_w$ ← a
    $d_w$ ← 0
    success ← false
    while (success = false) do
        // Step the tame kangaroo one hop
        i ← h($x_t$)
        $x_t$ ← $x_t$ × $r_i$
        $d_t$ ← $d_t$ + $s_i$
        if $x_t \in D$ then
            // If we have already stored this point for a wild kangaroo
            if ((m, $x_j$, $d_j$) = hash($x_t$)) && (m = ’wild’) then
                x ← N/2 + $d_t$ − $d_j$ mod N
                success ← true
            else
                hash($x_t$) ← (’tame’, $x_t$, $d_t$)
            end if
        end if
        // Step the wild kangaroo one hop
        i ← h($x_w$)
        $x_w$ ← $x_w$ × $r_i$
        $d_w$ ← $d_w$ + $s_i$
        if $x_w \in D$ then
            // If we have already stored this point for a tame kangaroo
            if ((m, $x_j$, $d_j$) = hash($x_w$)) && (m = ’tame’) then
                x ← N/2 + $d_j$ − $d_w$ mod N
                success ← true
            else
                hash($x_w$) ← (’wild’, $x_w$, $d_w$)
            end if
        end if
    end while
    return x

The algorithm uses two kangaroos (sequences), one wild and one tame. The wild one
starts on element $a = g^x$. Its starting exponent, x, is unknown; it is the discrete
logarithm that we are trying to find. We start the tame kangaroo at a known position, halfway
through the cycle, at $g^{N/2}$. We alternate stepping the wild and tame kangaroos. We keep track
of their respective positions, $x_w$, $x_t$, and their respective distances traveled, $d_w$, $d_t$.

We want the wild kangaroo to land on the path of the tame kangaroo. Since we know the
exponent of the tame kangaroo, $N/2 + d_t$, we can calculate the discrete logarithm by
subtracting the distance traveled by the wild kangaroo,

$$\log_g a = x = \frac{N}{2} + d_t - d_w \bmod N.$$

As the two kangaroos jump, their paths will eventually converge.

Anytime a kangaroo lands on a distinguished point, we store which kangaroo, the point, 
and the distance traveled in a hash table. If the other kangaroo has already stored this point in 
the hash table, then the paths have converged, and we can solve for the discrete logarithm. The 
use of distinguished points allows us to discover the convergence point quickly while reducing 
memory accesses and storage requirements. Memory only needs to be read or written on the 
small percentage of steps that land on distinguished points. 

To find the runtime and storage requirements, we follow the approximate analysis of
Pollard [23]. We consider the algorithm as three stages:

1. The kangaroo in back must catch up with the starting point of the other kangaroo. 

2. The back kangaroo must then land on the path of the other kangaroo. 

3. The back kangaroo must continue until it reaches a distinguished point. 
Throughout each stage, the back kangaroo could be either the wild or the tame kangaroo. 

At the start, the back kangaroo can be at most half a cycle behind and on average will be
a quarter of a cycle behind, or $N/4$. The mean step size is $m = \sqrt{N}/2$, so the back
kangaroo needs $(N/4)/(\sqrt{N}/2) = \sqrt{N}/2$ steps, on average, to catch up to the starting
point of the front kangaroo. Given that the kangaroos alternate steps, the average running time
of Stage 1 is $\sqrt{N}$ group operations.

Once the back kangaroo has caught up, it must land on the front kangaroo’s path. Given
a mean step size, m, one out of every m elements will be on the kangaroo’s path, on average.
Each hop of the back kangaroo has a $1/m$ chance of landing on the other kangaroo’s path. Thus
the kangaroo will land on the path after an expected m hops or 2m total steps of both kangaroos.
The average running time of Stage 2 is $2m = 2(\sqrt{N}/2) = \sqrt{N}$.

Now that the back kangaroo is on the same path, it must step until it reaches a
distinguished point. Given that there are $c\sqrt{N}$ distinguished points in the group for some
constant $c > 1$, one of every $N/(c\sqrt{N}) = \sqrt{N}/c$ elements is a distinguished point. The
kangaroo will land on a distinguished point after an expected $\sqrt{N}/c$ hops. The expected
running time of Stage 3 is $2\sqrt{N}/c$ group operations.


Summing the running times of the three stages gives a total expected running time of

$$\sqrt{N} + \sqrt{N} + \frac{2\sqrt{N}}{c} = 2\sqrt{N}\left(1 + \frac{1}{c}\right).$$

Given that c is large, the running time is approximately $2\sqrt{N}$. The asymptotic running
time in terms of the bit-length n is $O(2^{n/2})$.

The algorithm needs storage for the distinguished points. The expected number of
distinguished points will be the expected number of steps multiplied by the fraction of elements
that are distinguished points,

$$2\sqrt{N}\left(1 + \frac{1}{c}\right) \cdot \frac{c}{\sqrt{N}} = 2c\left(1 + \frac{1}{c}\right) = 2(c + 1).$$

For each distinguished point, we store a pair of n-bit quantities: a group element and an integer
distance. Thus the total expected storage required by the algorithm is $4(c + 1)n$ bits. Because
c is a constant, the total storage requirement is O(n).
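
A simplified Python sketch of the two-kangaroo walk with distinguished points is given below. It is an illustration under several assumptions, not a faithful implementation of Algorithm 17: the hash h is taken to be the residue of the element modulo k, a point is treated as distinguished when its two low bits are zero, and the search gives up and returns `None` after a fixed number of hops, since in a very small group the two deterministic walks can fall into separate loops. All names and parameters are hypothetical.

```python
from math import isqrt


def kangaroo_dlog(g, a, p, N, mask=3, max_hops=10_000):
    """Return x with g**x == a (mod p), or None if no collision is detected."""
    # Step sizes 2^0 .. 2^(k-1); grow k until their mean reaches about sqrt(N)/2.
    k = 1
    while (2 ** k - 1) / k < isqrt(N) / 2:
        k += 1
    steps = [2 ** i for i in range(k)]
    jumps = [pow(g, s, p) for s in steps]

    start = N // 2  # the tame kangaroo starts at the known exponent N/2
    pos = {"tame": pow(g, start, p), "wild": a % p}
    dist = {"tame": 0, "wild": 0}
    seen = {}  # distinguished point -> (kind, distance traveled)

    for _ in range(max_hops):
        for kind in ("tame", "wild"):
            i = pos[kind] % k  # assumed hash from element to step index
            pos[kind] = (pos[kind] * jumps[i]) % p
            dist[kind] += steps[i]
            if pos[kind] & mask:
                continue  # not a distinguished point: nothing is stored
            stored = seen.get(pos[kind])
            if stored is None:
                seen[pos[kind]] = (kind, dist[kind])
            elif stored[0] != kind:
                d_t = dist["tame"] if kind == "tame" else stored[1]
                d_w = dist["wild"] if kind == "wild" else stored[1]
                # Same point: x + d_w = N/2 + d_t (mod N).
                return (start + d_t - d_w) % N
    return None


# Toy parameters (assumed): 4 has prime order 509 modulo 1019.
p, g, N = 1019, 4, 509
x = kangaroo_dlog(g, pow(g, 200, p), p, N)
assert x is None or x == 200
```

Memory is touched only on the hops that land on distinguished points, which is the property the analysis above relies on.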


I. Index Calculus Algorithm 


The index calculus algorithm takes advantage of the structure of the group elements,
specifically the fact that group elements can be factored into a product of primes. Unlike the
generic algorithms that treat the group as a black box and work in any group, the index calculus
algorithm only applies to groups with the necessary structure, like $\mathbb{Z}_p^*$. The
algorithm is divided into three phases. In the first phase, a number of linear relations are
found. In the second phase, a solution is found to the system of linear relations. In the final
phase, an individual discrete logarithm instance is solved.

The index calculus algorithm (Algorithm 18) depends on the fact that many elements
of the group can be represented as the product of a small number of group elements. In the
case of $\mathbb{Z}_p^*$, many integers can be represented as a product of small primes. An
integer is called B-smooth if it has no prime factors larger than B. The primes no larger than B
make up a factor base, $S = (p_1, p_2, \ldots, p_k)$, where k is the number of such primes. A
B-smooth integer is one that can be represented as a product of the elements of S.

With an optimal choice for the bound, B, the running time of the index calculus
algorithm is subexponential. That is, it is faster than any algorithm that is exponential in the
input size. We will use the standard notation for subexponential running times [16],

$$L_p(a, c) = O\left(\exp\left((c + o(1)) (\ln p)^a (\ln\ln p)^{1-a}\right)\right),$$

where $0 \le a \le 1$ and $c > 0$. If the first parameter, a, is 0, the algorithm is polynomial
with degree equal to the second parameter, c. If a is 1, the algorithm is fully exponential. For
a between 0 and 1, the algorithm is called subexponential.

When comparing two algorithms using the $L_p(a, c)$ notation, the smaller the value a,
the shorter the asymptotic running time. If both algorithms have the same a, then the one with
the smaller value of c will be faster.
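
To get a feel for the notation, the following Python sketch evaluates the base-2 logarithm of $\exp(c (\ln p)^a (\ln\ln p)^{1-a})$ for a modulus of a given bit-length, dropping the o(1) term. The function name `log2_L` and the 1024-bit example size are assumptions for illustration, and the figures are rough indications rather than security estimates.

```python
from math import log


def log2_L(p_bits, a, c):
    """log2 of exp(c * (ln p)**a * (ln ln p)**(1 - a)), ignoring the o(1) term."""
    ln_p = p_bits * log(2)
    return c * ln_p ** a * log(ln_p) ** (1 - a) / log(2)


bits = 1024  # assumed example size for the prime p
print(log2_L(bits, 1, 0.5))      # fully exponential, like a 2^(n/2) generic attack
print(log2_L(bits, 1 / 2, 1))    # subexponential with a = 1/2
print(log2_L(bits, 1 / 3, 1.923))  # subexponential with a = 1/3
```

With a = 1 the function returns c times the bit-length, and with a = 0 it returns the logarithm of a polynomial of degree c in ln p, matching the two endpoints described above.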

During the first phase of the algorithm, we generate random group elements, $g^y$, by
randomly selecting exponents, y. We test each element to find any that are B-smooth and factor
those we find. Then we take the logarithm of the factorization, giving us a linear equation in
terms of the discrete logarithms of the primes in the factor base. We continue until we have
more relations than there are unknowns.

In the second phase, we solve for the k unknowns among the linear relations found in
Phase 1. Phase 2 is complete when we have a table that holds the discrete logarithms of each of
the primes in the factor base, table(i) = $\log_g p_i$ for $1 \le i \le k$.

The third phase proceeds much like the first, except now we need to find just one
B-smooth integer. It will be of the form $g^{x+y}$, where $x = \log_g a$ is the discrete
logarithm we are trying to solve and y is between 0 and p − 1. We randomly select y until
$u = a g^y = g^x g^y = g^{x+y}$ is smooth. Next we find the factorization of u,

$$g^{x+y} = \prod_{i=1}^{k} p_i^{c_i}.$$

Then taking the discrete logarithm of both sides,

$$x + y = \sum_{i=1}^{k} c_i \log_g p_i \pmod{p - 1}.$$

The discrete logarithms for any of the small primes, $p_i$, can be read from the table created in
Phase 2, and thus we can simply solve for x,

$$x = \sum_{i=1}^{k} c_i\,\mathrm{table}(i) - y \pmod{p - 1}.$$


Algorithm 18 Index Calculus Algorithm
Input: Prime: p, Generator of $\mathbb{Z}_p^*$: g, Element of $\mathbb{Z}_p^*$: a
Output: Exponent: x satisfying $g^x = a \bmod p$, where $0 \le x < p - 1$.
    // Setup: Select a factor base
    B ← a bound for the largest prime in the factor base, S
    S ← ($p_1, p_2, \ldots, p_k$) where S contains all primes $p_i \le B$
    // Phase 1: Find linear relations of the factor base
    // Find a few more relations than the size of the factor base to ensure a unique solution
    for j = 0 to k + c do
        repeat
            y ← randomly selected exponent between 0 and p − 1
            u ← $g^y$ mod p
        until (u is B-smooth)
        // Find the factorization of u
        $u = \prod_{1 \le i \le k} p_i^{c_i}$
        // Take logarithms and store the linear relation
        $y = \sum_{1 \le i \le k} c_i \log_g p_i \pmod{p - 1}$
    end for
    // Phase 2: Solve system of linear relations
    Given the k + c relations from Phase 1, solve for the k unknown discrete logarithms of the
    factor base.
    Store the logarithms, such that table(i) = $\log_g p_i$, $1 \le i \le k$
    // Phase 3: Solve for the individual discrete logarithm
    repeat
        y ← randomly selected exponent between 0 and p − 1
        u ← $a g^y$ mod p
    until (u is B-smooth)
    // Find the factorization of u
    $u = \prod_{1 \le i \le k} p_i^{c_i}$
    // Take logarithms of both sides
    $x + y = \sum_{1 \le i \le k} c_i \log_g p_i \pmod{p - 1}$
    $x = \sum_{1 \le i \le k} c_i\,\mathrm{table}(i) - y \pmod{p - 1}$
    return x
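
The relation-finding step of Phase 1 and the individual logarithm of Phase 3 can be sketched in Python on a toy prime. This sketch makes one large simplification that should be stated plainly: Phase 2 is not implemented, and the table of factor-base logarithms is instead filled in by brute force, which is only possible because the example prime is tiny. The function names, the prime p = 1019 with generator 2, and the factor base of primes up to 13 are all assumptions for illustration.

```python
import random


def factor_over_base(u, base):
    """Return the exponent list of u over `base`, or None if u is not smooth."""
    exps = []
    for q in base:
        e = 0
        while u % q == 0:
            u //= q
            e += 1
        exps.append(e)
    return exps if u == 1 else None


def find_relation(g, p, base, rng):
    """Phase 1 step: pick random y until g**y mod p is smooth; return (y, exps)."""
    while True:
        y = rng.randrange(p - 1)
        exps = factor_over_base(pow(g, y, p), base)
        if exps is not None:
            return y, exps


def factor_base_logs(g, p, base):
    """Stand-in for Phase 2 (assumption): brute-force the logs of the factor base."""
    logs, b = {}, 1
    for x in range(p - 1):
        if b in base:
            logs[b] = x
        b = (b * g) % p
    return logs


def individual_log(g, a, p, base, table, rng):
    """Phase 3: find y with a * g**y smooth, then x = sum(c_i * table[p_i]) - y."""
    while True:
        y = rng.randrange(p - 1)
        exps = factor_over_base((a * pow(g, y, p)) % p, base)
        if exps is not None:
            total = sum(c * table[q] for c, q in zip(exps, base))
            return (total - y) % (p - 1)


# Toy parameters (assumed): 2 generates Z_1019^*; factor base = primes up to 13.
rng = random.Random(0)
p, g, base = 1019, 2, [2, 3, 5, 7, 11, 13]
table = factor_base_logs(g, p, base)
assert individual_log(g, pow(g, 777, p), p, base, table, rng) == 777
```

In a real attack the table would come from solving the sparse linear system of Phase 1 relations modulo p − 1, and that table is the group-specific precomputation that can be reused for every later logarithm.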


To analyze the running time, we consider the time of each phase separately. The
running time of the first phase, $T_1$, will be the number of smooth elements that need to be
found, k, multiplied by $E_s$, the expected number of elements to test to find one B-smooth
element, multiplied by the time, $T_s$, to test one element for smoothness. That is,

$$T_1 = k E_s T_s.$$


Solving the linear system requires having as many linear equations as there are
unknowns. The unknowns are the logarithms of the factor base. Thus, we need to find as many
B-smooth elements as there are primes in our factor base. The size of the factor base is the
number of primes less than B,

$$k = \pi(B) \approx \frac{B}{\log B}.$$

The expected number of integers to test to find one B-smooth integer depends on the
distribution of smooth integers. The probability that a random element of $\mathbb{Z}_p^*$ is
B-smooth is $\psi(p, B)/p$, where $\psi(p, B)$ is the number of B-smooth numbers less than p.
Thus, we expect to find a B-smooth element after testing $E_s = p/\psi(p, B)$ random elements.
An approximation for the number of B-smooth numbers less than p is

$$\psi(p, B) \approx p u^{-u},$$

where $u = \log p/\log B$. Therefore,

$$E_s = \frac{p}{\psi(p, B)} = \frac{p}{p u^{-u}} = u^u = (\log p/\log B)^{\log p/\log B}.$$

The time required to test an element for smoothness depends on the method used. The
simplest method, trial division, will require k divisions, where k is the size of the factor
base, S. A sieving method, where many values are tested simultaneously, is much more efficient,
giving a time $T_s = \log\log B$ [24]. The total runtime of Phase 1 is

$$T_1 = k E_s T_s = \frac{B}{\log B} (\log p/\log B)^{\log p/\log B} \log\log B.$$
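
The smoothness estimate can be compared against an exact count for small numbers. The Python sketch below counts B-smooth integers by trial division and evaluates the $u^u$ estimate for the expected number of trials. The function names and the example values are assumptions, and the approximation is asymptotic, so only rough agreement should be expected at this scale.

```python
from math import log


def is_smooth(m, B):
    """True if m >= 1 has no prime factor larger than B (trial division)."""
    d = 2
    while d <= B and m > 1:
        while m % d == 0:
            m //= d
        d += 1
    return m == 1


def count_smooth(x, B):
    """Number of B-smooth integers in 1..x."""
    return sum(is_smooth(m, B) for m in range(1, x + 1))


def expected_trials_estimate(p, B):
    """The u**u estimate for trials per smooth element, with u = log p / log B."""
    u = log(p) / log(B)
    return u ** u


# Toy values (assumed): compare the exact ratio with the estimate.
p, B = 1019, 13
print((p - 1) / count_smooth(p - 1, B))  # exact expected trials in 1..p-1
print(expected_trials_estimate(p, B))    # asymptotic estimate u**u
```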
The running time of Phase 2 is the time to solve a $k \times k$ linear system. Using Gaussian
elimination would take $O(k^3)$ time, but because the system is very sparse there are methods
that work in $O(k^2)$ time [14]. Recall that $k \approx B/\log B$. Thus the running time of
Phase 2 is

$$T_2 = k^2 = \left(\frac{B}{\log B}\right)^2.$$

The calculation for the running time of Phase 3, $T_3$, is very similar to that of Phase 1.
The biggest difference is that in Phase 3 only one smooth element needs to be found. That means
the sieving approach used in Phase 1 to find many smooth elements simultaneously is not
applicable. Instead, the most efficient method is elliptic curve factorization in time
$L_p(1/2, \sqrt{2})$ [16]. This gives a total running time for Phase 3 of

$$T_3 = E_s T_s = (\log p/\log B)^{\log p/\log B} \, L_p(1/2, \sqrt{2}).$$

With an optimal choice for the bound, B, Adleman [1] showed that the index calculus
algorithm is subexponential with a running time of $L_p(1/2, c)$. Coppersmith, Odlyzko, and
Schroeppel [2] showed that by using sieving methods in Phase 1 a running time of $L_p(1/2, 1)$
could be achieved. Currently the fastest algorithm for solving discrete logarithms in
$\mathbb{Z}_p^*$ is a complex variant of the index calculus algorithm called the number field
sieve. The running time of the number field sieve for discrete logarithms is
$L_p(1/3, 1.923)$, but a discussion of this algorithm is beyond the scope of this thesis.

The storage requirement of the index calculus algorithm is the space needed to represent
the system of linear equations. Although the system being solved is $k \times k$, the system is
very sparse and the zero entries need not be stored. Each equation will have fewer than log B
non-zero entries of size n. So the asymptotic size of the index calculus algorithm is

$$\frac{B}{\log B} \, n \log B = nB.$$

To achieve the optimal running time for the index calculus algorithm, the choice of B is
$L_p(1/2, 1/2)$.


J. Summary 


The asymptotic space and running times of the algorithms presented in this chapter are
summarized in Table 3. The running times of all the generic algorithms are exponential, with the
three best being $O(2^{n/2})$. Pollard’s Rho and Pollard’s Kangaroo algorithms achieve this
running time with only linear storage requirements. The kangaroo method is about 1.60 times
slower than the best variants of the rho method, which achieve an expected running time of
$\sqrt{\pi N/2}$ [27]. The only group-specific algorithm presented, the index calculus algorithm,
is subexponential in both running time and space.








Algorithm            | Space              | Running Time
Brute-Force Search   | $O(n)$             | $O(2^n)$
Precomputed Table    | $O(n 2^n)$         | $O(2^n)$
Shanks’              | $O(n 2^{n/2})$     | $O(2^{n/2})$
Pollard’s Rho        | $O(n)$             | $O(2^{n/2})$
Pollard’s Kangaroo   | $O(n)$             | $O(2^{n/2})$
Index Calculus       | $L_p(1/2, 1/2)$    | $L_p(1/2, 1)$

Table 3: Complexity of Discrete Logarithm Algorithms


Not included in the table is the Pohlig-Hellman algorithm. The dominant step of that algorithm is to compute the discrete logarithm in the subgroup of order $q$, where $q$ is the largest prime factor of $N$. The runtime of the algorithm will therefore be that of the algorithm to solve the discrete logarithm in a prime subgroup. Any generic algorithm could be used for this step. Pohlig and Hellman initially suggested Shanks’s algorithm [21], but the rho algorithm would be the superior choice today.


IV. Complexity of Discrete Logarithms over Fixed Groups


In this chapter, we examine the complexity of discrete logarithms over fixed groups. In 
particular, we introduce the para-discrete logarithm problem, a variant of the discrete logarithm 
problem that more closely models cryptologic applications over fixed groups. Next, we discuss 
how to model an adversary with access to a group-specific precomputation. Then we re-examine 
each algorithm from the previous chapter as a para-discrete logarithm solver. We summarize 
the analysis with a chart showing the run times and precomputation sizes for each algorithm. 
We then apply our analysis of generic algorithms to place an upper bound on the complexity of 
the generalized para-discrete logarithm problem. Finally, we use our analysis of index calculus 


algorithms to place an upper bound on the para-discrete logarithm problem. 


A. The Para-Discrete Logarithm Problem 


Complexity-theoretic models are useful for evaluating the security of real-world cryptographic applications. However, a model can also provide a false sense of security if it oversimplifies the implementation details or makes bad assumptions about the capabilities of the adversary. To illustrate this point, imagine a protocol that implements ElGamal public key encryption. This hypothetical protocol requires a user to prove they know their private key by responding with the plaintext after receiving an encrypted random number as a challenge. This allows an adversary to mount a chosen ciphertext attack. Consider an adversary who intercepts a ciphertext, $(c_1 = g^y, c_2 = g^{xy} m)$, encrypted with user A’s public key, $g^x$, and wants to read the secret message, $m$. The adversary selects a random $r$ and sends $(c_1' = c_1, c_2' = c_2 r = g^{xy} m r)$ to user A as a random number challenge. User A will decrypt and return the seemingly random message, $m' = mr$. From $m'$, the attacker easily solves for $m$ by multiplying by $r^{-1}$, giving $m = m r r^{-1}$. Thus, if ElGamal is used within a flawed protocol, the difficulty of the discrete logarithm problem is irrelevant. A security model must consider the protocol as a whole and not just the underlying cryptography.
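The chosen ciphertext attack described above can be sketched in a few lines of Python. This is only an illustration under assumed toy parameters: the prime `p = 467`, the generator `g = 2`, and the function names (`keygen`, `encrypt`, `challenge_response`, `attack`) are hypothetical and do not come from any protocol specification.

```python
import secrets

# Toy parameters (assumptions for illustration only); real groups are far larger.
p = 467          # small prime modulus
g = 2            # generator of Z_p^*

def keygen():
    x = secrets.randbelow(p - 2) + 1            # private key x
    return x, pow(g, x, p)                      # (private, public = g^x)

def encrypt(pub, m):
    y = secrets.randbelow(p - 2) + 1            # ephemeral exponent y
    return pow(g, y, p), (pow(pub, y, p) * m) % p   # (c1, c2) = (g^y, g^{xy} m)

def decrypt(priv, c1, c2):
    shared = pow(c1, priv, p)                   # g^{xy}
    return (c2 * pow(shared, -1, p)) % p

def challenge_response(priv, c1, c2):
    # The flawed protocol: the user returns the plaintext of any challenge.
    return decrypt(priv, c1, c2)

def attack(c1, c2, oracle):
    r = secrets.randbelow(p - 2) + 2            # blinding factor, 2 <= r <= p - 1
    m_blinded = oracle(c1, (c2 * r) % p)        # user returns m' = m r
    return (m_blinded * pow(r, -1, p)) % p      # m = m' r^{-1}
```

The attacker never computes a discrete logarithm; the blinded challenge looks random to the user, yet unblinding the reply recovers the message.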
The use of fixed groups in security protocols inspires the question: Do our existing
models sufficiently capture these applications? In the DLP and GDLP, the group and generator 
are inputs to the problem along with a particular instance to solve. Yet in a fixed group imple- 
mentation, every instance takes place over the same small set of groups. Does this provide an 
advantage to an adversary? We propose a new complexity problem that more closely models 
these fixed group protocols, the para-discrete logarithm problem. 


Definition 19 The Para-Discrete Logarithm Problem (PDLP) 





Setup: Let $\mathbf{p} = p_2, p_3, p_4, \ldots$ be an infinite sequence of primes, where $p_i$ is a prime of bit-length $i$. Let $\mathbf{g} = g_2, g_3, g_4, \ldots$ be an infinite sequence of integers, where $0 < g_i < p_i$ and $g_i$ generates $\mathbb{Z}_{p_i}^*$.


Input: Security Parameter: $1^n$, Group Element: $a \in \mathbb{Z}_{p_n}^*$, where $p_n \in \mathbf{p}$.


Output: Exponent: $x$ satisfying $g_n^x \equiv a \bmod p_n$, where $g_n \in \mathbf{g}$.





Unlike in the standard discrete logarithm problem, the group and generator are not inputs to the para-discrete logarithm problem. Instead, there is just a security parameter, $1^n$, that specifies the bit-length of the prime modulus. The prime modulus, $p_n$, comes from an infinite sequence of primes, with exactly one prime for a given bit-length. We contend this problem is a better computational model for discrete logarithms over fixed groups, because for a given security parameter there is a single defined group.

Just as the discrete logarithm problem can be generalized from $\mathbb{Z}_p^*$ to any cyclic group $G$, we can generalize the para-discrete logarithm problem.


Definition 20 The Generalized Para-Discrete Logarithm Problem (GPDLP) 





Setup: Let $\mathbf{G} = G_1, G_2, G_3, \ldots$ be an infinite sequence of groups, where $G_i$ is a cyclic group of order $N_i$, such that $2^{i-1} \le N_i < 2^i$. Let $\mathbf{g} = g_1, g_2, g_3, \ldots$ be an infinite sequence of group elements, where $g_i \in G_i$ and $g_i$ generates $G_i$.


Input: Security Parameter: $1^n$, Group Element: $a \in G_n$, where $G_n \in \mathbf{G}$.


Output: Exponent: $x$ satisfying $g_n^x = a$, where $g_n \in \mathbf{g}$.





Just as in the PDLP, the group is not an input to the GPDLP. Instead, the group is determined by the security parameter $1^n$. For a given $n$, the group is fixed to $G_n$, where $G_n$ is an element of an infinite sequence of groups $G_1, G_2, \ldots$.


B. The Para-Discrete Logarithm Problem with an Advice String


By removing the group as an input to the problem, the PDLP more closely models fixed 
group applications. In applications where groups are generated on the fly and used once, the 
adversary gains no advantage through a precomputation; the precomputation can only be used 
once. In contrast, with fixed groups a precomputation can provide the adversary an advantage 
for all instances over the life of the cryptographic application. 

We model this precomputation as an advice string. In computational complexity theory 
an advice string is an extra input to a computational problem that depends only on the length of 
the input. By the definition of the PDLP, the group is fixed for a given input length. This allows 
us to consider a group-specific computation as producing an advice string for the PDLP. (In the 
standard DLP setting, the precomputation could not be considered an advice string because it is 
dependent on an input to the problem, the specific group.) 

We assert that a conservative approach to evaluating the security of a protocol is to con- 
sider an attack where the adversary has access to a precomputation based only on the protocol 
standard. In the case of a protocol with fixed groups, we should consider an adversary with 
access to a group-specific precomputation. Using the advice-string formalism allows us to bet- 
ter consider the difficulty of solving discrete logarithms once a group-specific precomputation 
has been completed. We can consider the time and space complexity of solving the instance 


separately from the time to create the precomputation. 


C. Para-Discrete Logarithm Algorithms 


In this section, we re-examine each algorithm from the previous chapter as a para-discrete logarithm solver. We want to analyze the complexity of the PDLP with an advice string. To that end, we explicitly divide each algorithm into two sub-algorithms: the advice-generator (precomputation phase) and the instance-solver (search phase). The advice-generator performs a precomputation based only on the group and generator, not the specific problem instance. That allows us to treat the precomputation’s output as an advice string for the PDLP. The instance-solver searches for the solution of a specific problem instance, making use of the advice string. For each algorithm, we determine the asymptotic runtime of both phases and the size of the advice string.


1. Brute-Force Search and Precomputed Table Algorithms


We begin our re-examination with the two simplest algorithms. In the brute-force search, no precomputation is done. If we try to modify the brute-force search to have a precomputation, we end up with the precomputed table algorithm. The precomputed table algorithm very naturally divides into the two algorithms we are looking for. In the advice-generator algorithm, the table of all logarithms is built. In the instance-solver algorithm, a single lookup into the table returns the discrete logarithm.


Algorithm 21 Precomputed Table: Advice Generator
Input: Cyclic Group: G, Generator: g
Output: Advice string: hash such that hash[g^x] = x for 0 ≤ x < N
    b ← 1
    for x = 0 to N − 1 do
        hash[b] ← x
        b ← b × g
    end for
    return hash








Algorithm 22 Precomputed Table: Instance Solver
Input: Group Element: a, Advice string: hash
Output: Exponent: x such that g^x = a
    x ← hash[a]
    return x








The advice-generator will require $N$ group multiplications to perform the precomputation. The asymptotic running time of the advice-generator is $O(2^n)$. The size of the advice string (the precomputed hash table containing $N$ exponents) is exponential in $n$, $O(n2^n)$. After the precomputation, the instance solver requires a single table lookup to solve an individual discrete log.
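As a minimal sketch of this two-phase split, the following Python models the group as $\mathbb{Z}_p^*$ with a small prime. The function names `table_advice` and `table_solve` are hypothetical, and a Python dictionary stands in for the hash table.

```python
def table_advice(g, p):
    """Advice generator: store the exponent of every power of g (N = p - 1 entries)."""
    table = {}
    b = 1                          # b = g^0
    for x in range(p - 1):
        table[b] = x               # hash[g^x] = x
        b = (b * g) % p
    return table

def table_solve(a, table):
    """Instance solver: a single lookup returns the discrete logarithm of a."""
    return table[a]
```

For example, with the assumed toy group `p = 467`, `g = 2`, calling `table_solve(pow(2, 300, 467), table_advice(2, 467))` looks up the exponent without any group operations at solve time.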


2. Shanks’s Algorithm


The hash table built in Shanks’s algorithm is independent of a particular discrete logarithm instance, so the hash table can act as the advice string. That is, the advice-generator builds the hash table. The instance-solver then giant-steps from the input $a$ until it reaches an element with a precomputed discrete logarithm in the hash table.


Algorithm 23 Shanks’s Algorithm: Advice Generator
Input: Cyclic Group: G, Generator: g, Number of exponents to precompute: X
Output: Advice string: hash such that hash[g^i] = i for 0 ≤ i < X
    b ← 1
    for i = 0 to X − 1 do
        hash[b] ← i
        b ← b × g
    end for
    return hash





Algorithm 24 Shanks’s Algorithm: Instance Solver
Input: Group Element: a, Advice string: hash, Number of precomputed logarithms: X
Output: Exponent: x such that g^x = a
    b ← a
    y ← 0
    h ← hash[b]
    while g^h ≠ b do
        b ← b × g^X
        y ← y + 1
        h ← hash[b]
    end while
    x ← h − yX mod N
    return x


Now we consider the runtime of each algorithm. For general $X$, the advice-generator takes $X$ group operations to generate an advice string with $X$ entries of size $n$ bits, resulting in an advice string of $O(nX)$. The instance-solver takes, on average, $\frac{N}{2X}$ operations. For $X = 2^{n/2}$, both the advice generator and instance-solver take on the order of $2^{n/2}$ group operations. Therefore, the runtime complexity of both algorithms is $O(2^{n/2})$ with an advice string of $O(n2^{n/2})$.

The best choice of X varies depending on the particular trade-offs of a given application. 
A smaller choice for X means a quicker precomputation and a smaller advice string, but at 
the expense of a longer runtime for the instance solver. Likewise, a larger X means quicker 
instance solving, but at the expense of a larger advice-string and a longer runtime for the advice 
generator. Note that for X = 1 we have essentially the brute-force search, and for X = N we 
have the precomputed table algorithm. 
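The split into a baby-step advice generator and a giant-step instance solver can be sketched as follows, again assuming the group is $\mathbb{Z}_p^*$ with $N = p - 1$. The names `shanks_advice` and `shanks_solve` are hypothetical, and membership in a Python dictionary replaces the test $g^h \neq b$.

```python
def shanks_advice(g, p, X):
    """Advice generator: hash[g^i] = i for 0 <= i < X (the baby steps)."""
    table = {}
    b = 1
    for i in range(X):
        table[b] = i
        b = (b * g) % p
    return table

def shanks_solve(a, g, p, N, X, table):
    """Instance solver: giant-step from a by g^X until a stored element is hit."""
    giant = pow(g, X, p)
    b, y = a % p, 0
    while b not in table:          # at most about N / X giant steps
        b = (b * giant) % p
        y += 1
    return (table[b] - y * X) % N  # a * g^(yX) = g^h, so x = h - yX mod N
```

Varying `X` in this sketch moves along the trade-off described above: a larger table shortens the `while` loop, and a smaller one lengthens it.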

Given a desired number, $k$, of instances to solve in a particular group, we can select an $X$ to minimize overall computation time. The total computation time is that of one precomputation plus $k$ instance computations,

$$f(X) = X + k\frac{N}{2X}.$$

To minimize $f(X)$, we find the positive zero of the derivative,

$$\frac{d}{dX} f(X) = 1 - \frac{kN}{2X^2} = 0$$
$$2X^2 = kN$$
$$X^2 = \frac{kN}{2}$$
$$X = \sqrt{\frac{kN}{2}}$$

Therefore, the best choice of $X$ to minimize computation time when solving $k$ instances is $\sqrt{kN/2}$ or, in terms of the bit-length, $n$, $X = \sqrt{k2^{n-1}}$. This gives a total computation time of $\sqrt{2kN}$ and an average time per solution of $\frac{\sqrt{2kN}}{k} = \sqrt{\frac{2N}{k}}$.
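The optimum above is easy to check numerically. The sketch below uses the cost model $f(X) = X + kN/(2X)$ from the text; the function names and the sample values of `k` and `N` are assumptions chosen only for illustration.

```python
import math

def total_time(X, k, N):
    """One precomputation of X steps plus k instance solutions of N / (2X) steps."""
    return X + k * N / (2 * X)

def optimal_X(k, N):
    """Positive zero of the derivative of total_time with respect to X."""
    return math.sqrt(k * N / 2)

k, N = 100, 2 ** 40
X_opt = optimal_X(k, N)
best = total_time(X_opt, k, N)     # should equal sqrt(2 k N)
per_instance = best / k            # should equal sqrt(2 N / k)
```

Evaluating `total_time` slightly to either side of `X_opt` gives a larger value, which is consistent with the derivative argument.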


3. Pollard’s Rho Algorithm 


As we examine Pollard’s Rho algorithm, we see that it does not fit the two-phase pattern. The algorithm only requires a small amount of storage while running and does not make use of a precomputation. The random sequence depends on the particular instance, $a = g^x$, we are trying to solve. It is not immediately clear how an instance-independent precomputation could assist this algorithm. Using our terminology, there is only an instance-solver algorithm and it uses no advice string.

However, [13] analyzes how work can be saved from each instance of the rho algorithm, 
speeding the solution of subsequent instances over the same group. They note that the table 
of distinguished points stored during computation of one logarithm becomes a table of known 
logarithms once that instance has a solution. Those distinguished points can be saved and 
used to assist the next instance and so on. The saved distinguished points are effectively a 
precomputation for solving the next instance. 

This idea can be extended to create a useful advice string, a database of logarithms of distinguished points. The advice generator selects random exponents, $i$, to create random elements, $g^i$, and stores the discrete logarithm each time a distinguished point is found. Additional distinguished points are found until the desired advice size has been reached. Let $d$ be the number of distinguished point logarithms we precompute. In one extreme, where $d = 0$, we have the standard rho algorithm. At the other extreme, where $d$ equals the total number of distinguished points in the group, we completely remove the benefit of finding cycles in the random sequence; the first time we reach a distinguished point, we can solve the logarithm. This extreme is clearly inferior to Shanks’s algorithm, where precomputing each logarithm requires only a single group operation.


Algorithm 25 Pollard’s Rho Algorithm: Advice Generator
Input: Cyclic Group: G, Generator: g, Order of g: N, Number of logarithms of distinguished points to precompute: d
Output: Advice string: hash such that hash[a^{α_i} g^{β_i}] = (α_i, β_i) for some a^{α_i} g^{β_i} ∈ D
    D ← a subset of distinguished points from G
    // Randomly choose exponents and store logarithms of d distinguished points
    for j = 1 to d do
        repeat
            i ← randomly selected exponent between 0 and N − 1
            s_i ← g^i
        until (s_i ∈ D)
        hash(s_i) ← (0, i)
    end for
    return hash





Algorithm 26 Pollard’s Rho Algorithm: Instance Solver
Input: Group Element: a, Order of g: N, Advice string: hash
Output: Exponent: x such that g^x = a
    // Search for a cycle in the random sequence S = s_0, s_1, ... defined by Algorithm 16
    success ← false
    D ← a subset of distinguished points from G
    while (success = false) do
        i ← 0
        α_i ← randomly selected exponent between 0 and N − 1
        β_i ← randomly selected exponent between 0 and N − 1
        s_i ← a^{α_i} g^{β_i}
        repeat
            i ← i + 1
            Calculate s_i, α_i, β_i applying Algorithm 16
        until (s_i ∈ D)
        // If we have already stored this point before
        if ((α_j, β_j) ← hash(s_i)) then
            success ← true
        else
            hash(s_i) ← (α_i, β_i)
        end if
    end while
    m ← α_i − α_j mod N
    x ← m^{-1}(β_j − β_i) mod N
    return x


Kuhn and Struik [13] show that computing a total of $X$ logarithms in the same group takes time $\sqrt{2NX}$ and that the $(X+1)$-th logarithm can be computed in time $\sqrt{N/(2X)}$ for $X \ll N^{1/4}$.

We design our advice generator so that it creates the number of distinguished points, $d$, equivalent to having solved $X$ logarithms. Because the results of Kuhn and Struik are limited to $X \ll N^{1/4}$, we let $X = N^{1/5}$. The advice generator will take time

$$t_{advice} = \sqrt{2NX} = \sqrt{2N(N^{1/5})} = \sqrt{2N^{6/5}} = \sqrt{2}\,N^{3/5}.$$

The instance solver will take time

$$t_{instance} = \sqrt{N/(2X)} = \sqrt{N/(2N^{1/5})} = \sqrt{N^{4/5}/2} = \frac{1}{\sqrt{2}}\,N^{2/5}.$$

The size of advice depends on $\theta$, the proportion of elements that are distinguished points. For the time estimate of the instance solver to be accurate, it must reach many distinguished points. Thus we select $\theta = c/N^{2/5}$. The number of distinguished points stored equals the runtime of the advice-generator multiplied by the proportion of distinguished points,

$$t_{advice}\,\theta = \sqrt{2}\,N^{3/5} \cdot c/N^{2/5} = \sqrt{2}\,c\,N^{1/5}.$$

In terms of the bit-length $n$, the asymptotic runtime is $O(2^{3n/5})$ for the advice-generator and $O(2^{2n/5})$ for the instance-solver, given an advice-string of size $O(n2^{n/5})$.
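To make these exponents concrete, the short sketch below evaluates the same formulas in base-2 logarithms for a given bit-length. The function name `rho_costs_log2` and the sample value $n = 160$ are assumptions for illustration; the constant $c$ is taken to be 1 unless supplied.

```python
def rho_costs_log2(n, log2_c=0.0):
    """log2 of advice time, instance time and stored points for N = 2^n, X = N^(1/5)."""
    log2_X = n / 5
    log2_t_advice = (1 + n + log2_X) / 2        # sqrt(2 N X)
    log2_t_instance = (n - 1 - log2_X) / 2      # sqrt(N / (2 X))
    log2_theta = log2_c - 2 * n / 5             # theta = c / N^(2/5)
    log2_points = log2_t_advice + log2_theta    # t_advice * theta
    return log2_t_advice, log2_t_instance, log2_points

advice, instance, points = rho_costs_log2(160)
```

For a 160-bit group order this gives exponents of $3n/5 + 1/2$, $2n/5 - 1/2$, and $n/5 + 1/2$ respectively, matching the asymptotic figures up to the constant factors of $\sqrt{2}$.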
4. Pollard’s Kangaroo Algorithm 


Pollard’s Kangaroo method does not fit the two-phase model, but we can adapt it to this setting. Recall that in this method, there are two kangaroos. The wild kangaroo starts at the instance we are trying to solve. The tame kangaroo starts from a fixed point on the cycle. Since the tame kangaroo’s behavior is independent of a particular instance, we can step through the tame kangaroo in the advice-generator phase. Then in the instance-solver algorithm, we only need to step the wild kangaroo.


Algorithm 27 Pollard’s Kangaroo Algorithm: Advice Generator
Input: Cyclic Group: G, Generator: g, Order of g: N, Mean step size: m
Output: Advice string: hash such that hash[g^i] = i for some g^i ∈ D
    // Select a small sequence of possible step sizes
    S ← (s_0, s_1, ..., s_{k−1}) where s_i = 2^i and k such that the mean of the entries is m
    R ← (r_0, r_1, ..., r_{k−1}) where r_i = g^{s_i}
    // Select a hash function to map a group element to a particular step size, s_i
    h(x) ← hash function mapping G into the interval [0..k − 1]
    D ← a subset of distinguished points from G
    // The “tame” kangaroo starts off at the beginning of the cycle
    x_t ← g^0
    d_t ← 0
    while (d_t < N) do
        // Step the tame kangaroo one hop
        i ← h(x_t)
        x_t ← x_t r_i
        d_t ← d_t + s_i
        if x_t ∈ D then
            // Store the exponent of the distinguished point in the hash table
            hash(x_t) ← d_t
        end if
    end while
    return hash





In the advice-generator, the tame kangaroo steps through the entire cycle, building a hash table of all the distinguished points along its path. For our analysis we will use $m$ for the mean step size of the values in set $S$ and $c$ for the mean distance between distinguished points. The time of the first phase will be $\frac{N}{m}$ and the storage will be $\frac{N}{mc}$.


Algorithm 28 Pollard’s Kangaroo Algorithm: Instance Solver
Input: Group Element: a, Order of g: N, Advice string: hash
Output: Exponent: x such that g^x = a
    // Use the same set S as in the advice-generator
    S ← (s_0, s_1, ..., s_{k−1}) where s_i = 2^i and k such that the mean of the entries is m
    R ← (r_0, r_1, ..., r_{k−1}) where r_i = g^{s_i}
    // Use the same hash function as in the advice-generator
    h(x) ← hash function mapping G into the interval [0..k − 1]
    D ← a subset of distinguished points from G
    // The “wild” kangaroo starts off from a = g^x
    x_w ← a
    d_w ← 0
    success ← false
    while (success = false) do
        // Step the wild kangaroo one hop
        i ← h(x_w)
        x_w ← x_w r_i
        d_w ← d_w + s_i
        if x_w ∈ D then
            // If we have already stored this point for a tame kangaroo
            if (d_t ← hash(x_w)) then
                x ← d_t − d_w mod N
                success ← true
            end if
        end if
    end while
    return x


In the instance-solver, the wild kangaroo steps through the cycle until it reaches a distinguished point stored in the advice string. To analyze the runtime we can break down the instance solver into two stages:


1. The wild kangaroo must first land on the path of the tame kangaroo.


2. The wild kangaroo must then continue until it reaches a distinguished point.


In Stage 1, the wild kangaroo must land on the tame kangaroo’s path. Given a mean step size, $m$, each hop of the wild kangaroo has a $\frac{1}{m}$ chance of landing on the tame kangaroo’s path. Thus the kangaroo will land on the path after an expected $m$ hops.

In Stage 2, the wild kangaroo is on the path of the tame kangaroo and must reach a distinguished point. Given that one of every $c$ elements is a distinguished point, the wild kangaroo will land on a distinguished point after an expected $c$ hops.

Combining the times from both stages gives a total runtime of $m + c$ for the instance-solver. If we select $c = m$, the runtime becomes $2m$ and the size of the advice string is $\frac{N}{m^2}$. We can perform a trade-off between the runtime and the size of the advice string by varying $m$.

One interesting trade-off is to balance the size of the advice string with the runtime,

$$\frac{N}{m^2} = 2m.$$

Solving for $m$,

$$m^3 = \frac{N}{2}$$
$$m = \sqrt[3]{\frac{N}{2}}$$

This gives us an advice size and instance-solver time of $2m = \sqrt[3]{4N}$. The advice-generator time is

$$\frac{N}{m} = \frac{N}{\sqrt[3]{N/2}} = \sqrt[3]{2N^2}.$$

In terms of $n$, the bit-length of $N$, the instance-solver runs in $O(2^{n/3})$ time with an $O(2^{n/3})$ size advice string. The advice-generator runs in $O(2^{2n/3})$ time.
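The tame and wild phases can be sketched in Python as below. This is a toy illustration, not the thesis’s implementation: the group is assumed to be $\mathbb{Z}_p^*$ with $N = p - 1$, the hash `x % k` and the distinguished-point rule `x % c == 0` are simple stand-ins, all function names are hypothetical, and the random starting offset with bounded restarts is an addition of this sketch to guarantee termination.

```python
import secrets

def walk_tables(g, p, k):
    """Step sizes S = (2^0, ..., 2^(k-1)) and the matching jumps R = (g^s)."""
    steps = [2 ** i for i in range(k)]
    jumps = [pow(g, s, p) for s in steps]
    return steps, jumps

def hop(x, d, p, steps, jumps):
    """One kangaroo hop; the step is chosen by hashing the current element."""
    i = x % len(steps)                 # toy hash h(x), an assumption of this sketch
    return (x * jumps[i]) % p, d + steps[i]

def kangaroo_advice(g, p, N, k, c):
    """Tame kangaroo: one lap from g^0, storing exponents of distinguished points."""
    steps, jumps = walk_tables(g, p, k)
    table, x_t, d_t = {}, 1, 0
    while d_t < N:
        x_t, d_t = hop(x_t, d_t, p, steps, jumps)
        if x_t % c == 0:               # toy rule: about 1 in c elements is distinguished
            table[x_t] = d_t
    return table

def kangaroo_solve(a, g, p, N, k, c, table, max_hops=1000, max_restarts=20):
    """Wild kangaroo: hop from a until a stored distinguished point is reached."""
    steps, jumps = walk_tables(g, p, k)
    for _ in range(max_restarts):
        z = secrets.randbelow(N)       # random starting offset (our addition)
        x_w, d_w = (a * pow(g, z, p)) % p, z
        for _ in range(max_hops):
            x_w, d_w = hop(x_w, d_w, p, steps, jumps)
            if x_w % c == 0 and x_w in table:
                return (table[x_w] - d_w) % N   # g^{d_t} = a g^{d_w}
    raise RuntimeError("no stored distinguished point reached")
```

With assumed toy values such as `p = 10007`, `g = 5`, `k = 4`, and `c = 5`, the advice table holds a few hundred exponents and the solver typically needs only a handful of hops, in line with the $m + c$ estimate.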


5. Index Calculus Algorithm 


The index calculus algorithm needs essentially no changes to match our desired two-algorithm pattern. Phases 1 and 2 already use only the group description to create a precomputation result. Combined, the first two phases become the advice-generator. The table of logarithms of the factor base becomes the advice string, and the final phase becomes our instance-solver algorithm.

The runtime of the advice generator will be the runtime of Phase 1 and Phase 2 of the standard index calculus algorithm,

$$t_{advice} = T_1 + T_2.$$

The runtime of the instance solver is that of Phase 3,

$$t_{instance} = T_3 = (\log p / \log B)^{\log p / \log B} \, L_B(\tfrac{1}{2}, \sqrt{2}).$$

The size of the advice string will be the size of the table of logarithms of the factor base,

$$n \frac{B}{\log B}.$$





Algorithm 29 Index Calculus Algorithm: Advice Generator
Input: Prime: p, Generator of Z_p^*: g
Output: Advice string: table such that table(i) = log_g p_i for 1 ≤ i ≤ k.
    // Setup: Select a factor base, S
    B ← a bound for the largest prime in the factor base
    S ← (p_1, p_2, ..., p_k) where S contains all primes, p_i ≤ B
    // Find linear relations of the factor base
    // Find a few more relations than the size of the factor base to ensure a unique solution
    for j = 0 to k + c do
        repeat
            y ← randomly selected exponent between 0 and p − 1
            u ← g^y mod p
        until (u is B-smooth)
        // Find the factorization of u
        u = ∏_{1≤i≤k} p_i^{c_i}
        // Take logarithms and store the linear relation
        y ≡ Σ_{1≤i≤k} c_i log_g p_i (mod p − 1)
    end for
    // Solve system of linear relations
    Given the k + c linear relations, solve for the k unknown discrete logarithms of the factor base.
    Store the logarithms, such that table(i) = log_g p_i, 1 ≤ i ≤ k
    return table


Looking at the structure of the algorithm, there is clearly a large imbalance between the runtime of the advice-generator and that of the instance-solver. The instance solver has to find only a single $B$-smooth integer, while the advice-generator must find $k + c \approx k$ $B$-smooth integers. In addition, half the time of the advice-generator is spent solving the system of linear relations.

With $B$ optimized to minimize the precomputation, $B = L_p(\frac{1}{2}, \frac{1}{2})$, the running time of the instance solver is $t_{instance} = L_p(\frac{1}{2}, \frac{1}{2})$ while the runtime of the advice-generator is $t_{advice} = L_p(\frac{1}{2}, 1)$. Recall

$$L_p(a, c) = O(\exp((c + o(1))(\ln p)^a (\ln \ln p)^{1-a})).$$

Thus, asymptotically, the instance time runs in just the square root of the advice time.

The choice of the bound, $B$, allows tradeoffs between the size of the advice and the runtime of the instance solver. A larger $B$ will result in a larger advice string, but make the search for $B$-smooth elements faster for the instance-solver.
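The square-root relationship between the two running times can be illustrated numerically. The sketch below evaluates the $L_p$ expression with the $o(1)$ term simply dropped, which is an assumption of this example; the resulting figures are therefore rough orders of magnitude and not security estimates. The helper name `L_bits` is hypothetical.

```python
import math

def L_bits(bits, alpha, c):
    """log2 of exp(c (ln p)^alpha (ln ln p)^(1 - alpha)) for a bits-bit prime p."""
    ln_p = bits * math.log(2)
    exponent = c * ln_p ** alpha * math.log(ln_p) ** (1 - alpha)
    return exponent / math.log(2)

advice_bits = L_bits(1024, 0.5, 1.0)     # advice-generator time, L_p(1/2, 1)
instance_bits = L_bits(1024, 0.5, 0.5)   # instance-solver time, L_p(1/2, 1/2)
```

For a 1024-bit prime these come out to roughly $2^{98}$ and $2^{49}$ under the stated simplification, so halving the constant $c$ halves the exponent.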
Algorithm 30 Index Calculus Algorithm: Instance Solver
Input: Element of Z_p^*: a, Advice string: table
Output: Exponent: x satisfying g^x ≡ a mod p, where 0 ≤ x < p − 1.
    // Solve for the individual discrete logarithm
    repeat
        y ← randomly selected exponent between 0 and p − 1
        u ← a g^y mod p
    until (u is B-smooth)
    // Find the factorization of u
    u = ∏_{1≤i≤k} p_i^{c_i}
    // Take logarithms of both sides
    x + y ≡ Σ_{1≤i≤k} c_i log_g p_i (mod p − 1)
    x ← Σ_{1≤i≤k} c_i table(i) − y (mod p − 1)
    return x
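A minimal Python sketch of the instance solver follows. To keep it short, the advice table is built here by exhaustive search instead of by relation collection and linear algebra; that shortcut, the toy parameters, and all function names are assumptions of this example. Smoothness is tested by trial division, whereas the analysis above assumes elliptic curve factorization.

```python
import secrets

def primes_up_to(B):
    """Factor base: all primes p_i <= B (naive trial division)."""
    return [q for q in range(2, B + 1)
            if all(q % d for d in range(2, int(q ** 0.5) + 1))]

def factor_base_logs(g, p, base):
    """Stand-in advice generator: logs of the factor base by exhaustive search."""
    wanted, logs, b = set(base), {}, 1
    for x in range(p - 1):
        if b in wanted:
            logs[b] = x
        b = (b * g) % p
    return logs

def smooth_exponents(u, base):
    """Exponents c_i with u = prod p_i^c_i, or None if u is not B-smooth."""
    exps = []
    for q in base:
        e = 0
        while u % q == 0:
            u //= q
            e += 1
        exps.append(e)
    return exps if u == 1 else None

def index_calculus_solve(a, g, p, base, logs):
    """Instance solver: find a B-smooth a * g^y, then combine the stored logs."""
    while True:
        y = secrets.randbelow(p - 1)
        u = (a * pow(g, y, p)) % p
        exps = smooth_exponents(u, base)
        if exps is not None:
            total = sum(e * logs[q] for e, q in zip(exps, base))
            return (total - y) % (p - 1)      # x = sum c_i log p_i - y
```

The sketch assumes `g` generates $\mathbb{Z}_p^*$, so that every prime in the factor base has a logarithm; for example, `g = 5` with `p = 10007` and a bound of `B = 30`.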





D. Summary 


In this section, we summarize the runtimes of the advice-generator and instance-solver 
algorithms for the para-discrete logarithm problem. For the algorithms that allow a time- 
memory trade-off, the formulas governing the trade-offs are shown in Table 4. 























Algorithm           | Advice-Generator Time | Advice Size   | Instance-Solver Time
Shanks’s            | $X$                   | $nX$          | $N/(2X)$
Pollard’s Rho       | $\sqrt{2NX}$          | $cnX$         | $\sqrt{N/(2X)}$
Pollard’s Kangaroo  | $N/m$                 | $nN/(mc)$     | $m + c$
Index Calculus      | $T_1 + T_2$           | $nB/\log B$   | $(\log p/\log B)^{\log p/\log B} L_B(\frac{1}{2}, \sqrt{2})$








Table 4: Time-Memory Trade-Offs of Para-Discrete Logarithm Algorithms 


The entries of Table 5 represent a specific trade-off point where advice size and instance time are roughly balanced. Each of the generic algorithms solves the GPDLP. The result from the precomputed-table algorithm shows that the GPDLP can be solved in constant time given an advice string exponential in size. More interestingly, using Pollard’s Kangaroo algorithm, the GPDLP can be solved in $O(\sqrt[3]{N})$ operations with access to advice of size $O(\sqrt[3]{N})$ elements. The index calculus algorithm places an upper bound on the complexity of the PDLP in $\mathbb{Z}_p^*$. The PDLP can be solved in subexponential time, $L_p(\frac{1}{2}, \frac{1}{2})$, with an advice string of subexponential size, $L_p(\frac{1}{2}, \frac{1}{2})$.








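Algorithm           | Advice-Generator Time | Advice Size                     | Instance-Solver Time
Precomputed Table   | $O(2^n)$              | $O(n2^n)$                       | $O(1)$
Shanks’s            | $O(2^{n/2})$          | $O(n2^{n/2})$                   | $O(2^{n/2})$
Pollard’s Rho       | $O(2^{3n/5})$         | $O(n2^{n/5})$                   | $O(2^{2n/5})$
Pollard’s Kangaroo  | $O(2^{2n/3})$         | $O(n2^{n/3})$                   | $O(2^{n/3})$
Index Calculus      | $L_p(\frac{1}{2}, 1)$ | $L_p(\frac{1}{2}, \frac{1}{2})$ | $L_p(\frac{1}{2}, \frac{1}{2})$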




















Table 5: Complexity of Para-Discrete Logarithm Algorithms 


In our conservative model of security for protocols over fixed groups, we consider only 
the instance-solver time, assuming a group specific precomputation is available. Under this 
model, the cryptographic strength provided by fixed groups is significantly less than that of 
one-time groups. This disparity is demonstrated when we compare the instance-solver runtimes 
with the standard runtimes from the previous chapter. 

By comparing the results of Table 5 with those of Table 3, we see that the discrete logarithm in a particular group is significantly easier given a group-specific advice string. In the case of the generic algorithms, the GDLP can be solved in $O(2^{n/2})$, but given an advice string of $O(n2^{n/3})$ can be solved in $O(2^{n/3})$ time using the kangaroo instance-solver algorithm. Similarly, the time to solve the DLP in $\mathbb{Z}_p^*$ with the index calculus algorithm improves from $L_p(\frac{1}{2}, 1)$ to $L_p(\frac{1}{2}, \frac{1}{2})$ when given an advice string of size $L_p(\frac{1}{2}, \frac{1}{2})$.